XLSB Microsoft Excel 2007+ document (binary format)
File Information
- Description: Microsoft Excel 2007+ document (binary format)
- Category: Document
- Extension: .xlsb
- MIME Type: application/vnd.ms-excel.sheet.binary.macroEnabled.12
XLSB (Excel Binary Workbook) Format
Overview
Excel Binary Workbook (XLSB) is a binary file format used by Microsoft Excel 2007 and later versions. It provides faster performance and smaller file sizes compared to XLSX while maintaining full compatibility with Excel features including formulas, charts, pivot tables, and macros.
Technical Specifications
- Format Type: Binary spreadsheet format
- File Extension: .xlsb
- MIME Type: application/vnd.ms-excel.sheet.binary.macroEnabled.12
- Container: ZIP-based package format
- Maximum Rows: 1,048,576 (2^20)
- Maximum Columns: 16,384 (XFD)
- Compression: ZIP compression with binary encoding
- Macro Support: Full VBA macro support
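The row and column limits above can be checked with a little arithmetic. The following is a minimal sketch; the helper name `column_letters_to_index` is an illustrative choice and not part of any Excel library.

```python
def column_letters_to_index(letters: str) -> int:
    """Convert an Excel column label such as 'A' or 'XFD' to a 1-based index."""
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        # Column labels are a base-26 numbering with digits A=1 .. Z=26
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


MAX_ROWS = 2 ** 20      # 1,048,576 rows
MAX_COLUMNS = 2 ** 14   # 16,384 columns

print(MAX_ROWS)
print(column_letters_to_index("XFD") == MAX_COLUMNS)
```

The last column label, XFD, maps to 24 × 676 + 6 × 26 + 4 = 16,384, which matches the stated column limit.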
Format Structure
XLSB files use a ZIP container with:
- Binary worksheet streams (faster to read/write)
- Relationships and content types (XML)
- Shared strings in binary format
- Formula storage in binary encoding
- Chart and drawing components
- VBA macro projects (if present)
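The container layout described above can be inspected with the Python standard library alone. The sketch below is illustrative: it treats the presence of `xl/workbook.bin` as the marker of an XLSB package, which is a heuristic assumption rather than a full validation, and `describe_xlsb_package` and `build_demo_package` are hypothetical helper names. The demo package contains empty parts and is not a valid workbook.

```python
import io
import zipfile


def describe_xlsb_package(source):
    """Summarize the parts of an XLSB-style ZIP package.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        names = package.namelist()
    return {
        # Heuristic: XLSB stores the workbook part as a binary stream
        "is_xlsb": "xl/workbook.bin" in names,
        "binary_parts": sorted(n for n in names if n.endswith(".bin")),
        "xml_parts": sorted(n for n in names if n.endswith((".xml", ".rels"))),
        "has_vba": any(n.endswith("vbaProject.bin") for n in names),
    }


def build_demo_package():
    """Build an in-memory ZIP with typical part names (empty payloads)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for name in (
            "[Content_Types].xml",
            "_rels/.rels",
            "xl/workbook.bin",
            "xl/worksheets/sheet1.bin",
            "xl/sharedStrings.bin",
        ):
            package.writestr(name, b"")
    buffer.seek(0)
    return buffer


summary = describe_xlsb_package(build_demo_package())
print(summary["is_xlsb"], summary["has_vba"])
print(summary["binary_parts"])
```

Passing a path to a real .xlsb file instead of the demo buffer should list the same kinds of parts, with worksheets and shared strings stored as `.bin` streams and relationships and content types stored as XML.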
History and Development
- 2007: Introduced with Excel 2007 (Office 2007)
- 2010-2016: Carried forward through Excel 2010, 2013, and 2016
- Present: Maintained as high-performance Excel format
Advantages over XLSX
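- Faster file I/O: Often cited as 2-3x faster opening and saving; the actual gain depends on workbook size and content
- Smaller file size: Typically 20-25% smaller than the equivalent XLSX file, though results vary with the data
- Faster formula loading: Formulas are stored in pre-parsed binary form, so they load without text parsing; recalculation itself runs in Excel's engine regardless of file format
- Full feature support: All Excel features including macros
- Reduced memory usage: Can lower memory overhead during load and save, since no XML text has to be parsed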
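Figures such as "20-25% smaller" and "2-3x faster" are easier to compare when computed the same way every time. The sketch below shows one way to derive both numbers; the function names and the sample sizes and timings are hypothetical and are not measurements.

```python
def percent_smaller(xlsx_bytes: int, xlsb_bytes: int) -> float:
    """Size saving of XLSB, expressed as a percentage of the XLSX size."""
    if xlsx_bytes <= 0:
        raise ValueError("xlsx_bytes must be positive")
    return (xlsx_bytes - xlsb_bytes) / xlsx_bytes * 100.0


def speedup(xlsx_seconds: float, xlsb_seconds: float) -> float:
    """How many times faster the XLSB operation was than the XLSX one."""
    if xlsb_seconds <= 0:
        raise ValueError("xlsb_seconds must be positive")
    return xlsx_seconds / xlsb_seconds


# Hypothetical inputs, chosen only to illustrate the arithmetic
saving = percent_smaller(xlsx_bytes=10_000_000, xlsb_bytes=7_800_000)
ratio = speedup(xlsx_seconds=6.0, xlsb_seconds=2.4)
print(f"{saving:.1f}% smaller, {ratio:.1f}x faster")
```

Note the choice of base: the saving here is relative to the XLSX size. A figure computed relative to the XLSB size ("XLSX is N% larger") will be a bigger number for the same pair of files.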
Use Cases
- Large spreadsheets requiring fast performance
- Financial models with complex calculations
- Data processing applications with many formulas
- Applications requiring VBA macro support
- High-frequency file operations
- Memory-constrained environments
Code Examples
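C# XLSB Processing with ClosedXML
Note: ClosedXML reads and writes the XML-based formats (.xlsx, .xlsm, .xltx, .xltm) and rejects the .xlsb extension. The workbook-level calls in this listing therefore apply only after the file has been converted, for example with Excel's Save As. The macro check works on .xlsb directly because it inspects only the ZIP container.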
using ClosedXML.Excel;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public class XLSBProcessor
{
    // Summary of a workbook file
    public class WorkbookInfo
    {
        public string FileName { get; set; }
        public int WorksheetCount { get; set; }
        public long FileSize { get; set; }
        public List<WorksheetInfo> Worksheets { get; set; } = new List<WorksheetInfo>();
        public bool HasMacros { get; set; }
        public DateTime CreationTime { get; set; }
    }

    // Summary of a single worksheet
    public class WorksheetInfo
    {
        public string Name { get; set; }
        public int UsedRows { get; set; }
        public int UsedColumns { get; set; }
        public int CellsWithData { get; set; }
        public int FormulaCells { get; set; }
        public Dictionary<string, int> DataTypes { get; set; } = new Dictionary<string, int>();
    }

    public static WorkbookInfo AnalyzeXLSBFile(string filePath)
    {
        var info = new WorkbookInfo
        {
            FileName = Path.GetFileName(filePath),
            FileSize = new FileInfo(filePath).Length,
            CreationTime = File.GetCreationTime(filePath)
        };
        // The macro check reads only the ZIP container, so it works on .xlsb
        info.HasMacros = HasVBAProject(filePath);
        try
        {
            // XLWorkbook throws for extensions other than .xlsx/.xlsm/.xltx/.xltm,
            // so for a raw .xlsb file only the file-level fields above are filled in
            using (var workbook = new XLWorkbook(filePath))
            {
                info.WorksheetCount = workbook.Worksheets.Count;
                foreach (var worksheet in workbook.Worksheets)
                {
                    var wsInfo = AnalyzeWorksheet(worksheet);
                    info.Worksheets.Add(wsInfo);
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error analyzing XLSB file: {ex.Message}");
        }
        return info;
    }

    private static WorksheetInfo AnalyzeWorksheet(IXLWorksheet worksheet)
    {
        var info = new WorksheetInfo
        {
            Name = worksheet.Name,
            DataTypes = new Dictionary<string, int>
            {
                ["Text"] = 0,
                ["Number"] = 0,
                ["Date"] = 0,
                ["Boolean"] = 0,
                ["Formula"] = 0,
                ["Error"] = 0
            }
        };
        var usedRange = worksheet.RangeUsed();
        if (usedRange != null)
        {
            info.UsedRows = usedRange.RowCount();
            info.UsedColumns = usedRange.ColumnCount();
            foreach (var cell in usedRange.Cells())
            {
                if (!cell.IsEmpty())
                {
                    info.CellsWithData++;
                    if (cell.HasFormula)
                    {
                        // Formula cells are counted separately from value types
                        info.FormulaCells++;
                        info.DataTypes["Formula"]++;
                    }
                    else
                    {
                        switch (cell.DataType)
                        {
                            case XLDataType.Text:
                                info.DataTypes["Text"]++;
                                break;
                            case XLDataType.Number:
                                info.DataTypes["Number"]++;
                                break;
                            case XLDataType.DateTime:
                                info.DataTypes["Date"]++;
                                break;
                            case XLDataType.Boolean:
                                info.DataTypes["Boolean"]++;
                                break;
                            case XLDataType.Error:
                                info.DataTypes["Error"]++;
                                break;
                        }
                    }
                }
            }
        }
        return info;
    }
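
    private static bool HasVBAProject(string filePath)
    {
        try
        {
            // An XLSB file is a ZIP package; macros live in xl/vbaProject.bin
            using (var package = System.IO.Compression.ZipFile.OpenRead(filePath))
            {
                return package.Entries.Any(e =>
                    e.FullName.StartsWith("xl/vbaProject", StringComparison.OrdinalIgnoreCase));
            }
        }
        catch
        {
            // Unreadable or non-ZIP input is reported as "no macros"
            return false;
        }
    }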

    public static void ConvertXLSBToXLSX(string xlsbPath, string xlsxPath)
    {
        try
        {
            // ClosedXML cannot open .xlsb input; this call succeeds only for
            // .xlsx/.xlsm sources and otherwise lands in the catch block below
            using (var workbook = new XLWorkbook(xlsbPath))
            {
                // XLSX cannot hold a VBA project, so macros are not expected
                // to survive this conversion
                workbook.SaveAs(xlsxPath);
                Console.WriteLine($"Converted {xlsbPath} to {xlsxPath}");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error converting file: {ex.Message}");
        }
    }

    public static void CreatePerformanceBenchmark(string outputPath, int rows, int columns)
    {
        var stopwatch = System.Diagnostics.Stopwatch.StartNew();
        using (var workbook = new XLWorkbook())
        {
            var worksheet = workbook.Worksheets.Add("Performance Test");
            // Generate large dataset
            for (int row = 1; row <= rows; row++)
            {
                for (int col = 1; col <= columns; col++)
                {
                    if (row == 1)
                    {
                        // Headers
                        worksheet.Cell(row, col).Value = $"Column_{col}";
                    }
                    else
                    {
                        // Data with various types
                        switch (col % 4)
                        {
                            case 0:
                                worksheet.Cell(row, col).Value = $"Text_{row}_{col}";
                                break;
                            case 1:
                                worksheet.Cell(row, col).Value = row * col;
                                break;
                            case 2:
                                worksheet.Cell(row, col).Value = DateTime.Now.AddDays(row);
                                break;
                            case 3:
                                worksheet.Cell(row, col).FormulaA1 = $"B{row}*2";
                                break;
                        }
                    }
                }
                if (row % 1000 == 0)
                {
                    Console.WriteLine($"Generated {row} rows...");
                }
            }
            // ClosedXML cannot write .xlsb, so the data is saved as .xlsx;
            // convert it to XLSB in Excel before comparing the two formats
            string xlsxPath = Path.ChangeExtension(outputPath, ".xlsx");
            workbook.SaveAs(xlsxPath);
            stopwatch.Stop();
            Console.WriteLine($"Created XLSX file with {rows} rows and {columns} columns");
            Console.WriteLine($"Time taken: {stopwatch.ElapsedMilliseconds} ms");
            // Floating-point division so that small files do not report 0 MB
            Console.WriteLine($"File size: {new FileInfo(xlsxPath).Length / (1024.0 * 1024.0):F2} MB");
        }
    }

    public static void ExtractFormulas(string xlsbPath)
    {
        try
        {
            using (var workbook = new XLWorkbook(xlsbPath))
            {
                foreach (var worksheet in workbook.Worksheets)
                {
                    Console.WriteLine($"\nFormulas in worksheet '{worksheet.Name}':");
                    // Select only the used cells that contain a formula
                    var formulaCells = worksheet.CellsUsed(c => c.HasFormula);
                    foreach (var cell in formulaCells)
                    {
                        Console.WriteLine($" {cell.Address}: {cell.FormulaA1} = {cell.Value}");
                    }
                }
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error extracting formulas: {ex.Message}");
        }
    }

    public static void OptimizeXLSBFile(string inputPath, string outputPath)
    {
        try
        {
            using (var workbook = new XLWorkbook(inputPath))
            {
                foreach (var worksheet in workbook.Worksheets)
                {
                    // Remove empty rows and columns
                    var usedRange = worksheet.RangeUsed();
                    if (usedRange != null)
                    {
                        // RangeUsed() and LastRowUsed() normally report the same
                        // boundary, so these branches act only as a guard
                        var lastRow = usedRange.LastRow().RowNumber();
                        if (lastRow < worksheet.LastRowUsed().RowNumber())
                        {
                            worksheet.Rows(lastRow + 1, worksheet.LastRowUsed().RowNumber()).Delete();
                        }
                        // Delete unused columns beyond the used range
                        var lastCol = usedRange.LastColumn().ColumnNumber();
                        if (lastCol < worksheet.LastColumnUsed().ColumnNumber())
                        {
                            worksheet.Columns(lastCol + 1, worksheet.LastColumnUsed().ColumnNumber()).Delete();
                        }
                    }
                    // Remove unused styles
                    // Note: ClosedXML handles this automatically during save
                }
                workbook.SaveAs(outputPath);
                var originalSize = new FileInfo(inputPath).Length;
                var optimizedSize = new FileInfo(outputPath).Length;
                var savings = (originalSize - optimizedSize) * 100.0 / originalSize;
                Console.WriteLine($"Optimized XLSB file saved: {outputPath}");
                Console.WriteLine($"Size reduction: {savings:F1}% ({originalSize / 1024} KB → {optimizedSize / 1024} KB)");
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Error optimizing file: {ex.Message}");
        }
    }
}

// Usage example
class Program
{
    static void Main(string[] args)
    {
        // Analyze existing XLSB file
        if (File.Exists("sample.xlsb"))
        {
            var info = XLSBProcessor.AnalyzeXLSBFile("sample.xlsb");
            Console.WriteLine($"File: {info.FileName}");
            Console.WriteLine($"Size: {info.FileSize / 1024} KB");
            Console.WriteLine($"Worksheets: {info.WorksheetCount}");
            Console.WriteLine($"Has Macros: {info.HasMacros}");
            foreach (var ws in info.Worksheets)
            {
                Console.WriteLine($"\nWorksheet: {ws.Name}");
                Console.WriteLine($" Used Range: {ws.UsedRows} × {ws.UsedColumns}");
                Console.WriteLine($" Cells with Data: {ws.CellsWithData}");
                Console.WriteLine($" Formula Cells: {ws.FormulaCells}");
            }
        }
        // Create performance test file
        XLSBProcessor.CreatePerformanceBenchmark("performance_test.xlsb", 10000, 20);
        // Convert XLSB to XLSX
        XLSBProcessor.ConvertXLSBToXLSX("performance_test.xlsb", "performance_test.xlsx");
        // Extract formulas
        XLSBProcessor.ExtractFormulas("performance_test.xlsb");
        // Optimize file
        XLSBProcessor.OptimizeXLSBFile("performance_test.xlsb", "optimized.xlsb");
    }
}
Python XLSB Processing with pyxlsb
import pyxlsb
import pandas as pd
import os
from datetime import datetime
import zipfile


class XLSBProcessor:
    def __init__(self, file_path):
        self.file_path = file_path

    def analyze_file(self):
        """Analyze XLSB file structure and content."""
        file_info = {
            'filename': os.path.basename(self.file_path),
            'file_size': os.path.getsize(self.file_path),
            'creation_time': datetime.fromtimestamp(os.path.getctime(self.file_path)),
            'modification_time': datetime.fromtimestamp(os.path.getmtime(self.file_path)),
            'worksheets': [],
            # Default so callers can read the key even if the workbook fails to open
            'worksheet_count': 0,
            'has_macros': self._check_for_macros()
        }
        try:
            with pyxlsb.open_workbook(self.file_path) as workbook:
                file_info['worksheet_count'] = len(workbook.sheets)
                for sheet_name in workbook.sheets:
                    sheet_info = self._analyze_worksheet(workbook, sheet_name)
                    file_info['worksheets'].append(sheet_info)
        except Exception as e:
            print(f"Error analyzing file: {e}")
        return file_info

    def _analyze_worksheet(self, workbook, sheet_name):
        """Analyze individual worksheet."""
        sheet_info = {
            'name': sheet_name,
            'rows': 0,
            'columns': 0,
            'cells_with_data': 0,
            'data_types': {
                'text': 0,
                'number': 0,
                'date': 0,
                'boolean': 0,
                'error': 0,
                'empty': 0
            },
            'sample_data': []
        }
        try:
            with workbook.get_sheet(sheet_name) as sheet:
                rows_data = []
                max_col = 0
                for row_idx, row in enumerate(sheet.rows()):
                    if row_idx >= 1000:  # Limit analysis for performance
                        break
                    row_data = []
                    for col_idx, cell in enumerate(row):
                        max_col = max(max_col, col_idx + 1)
                        if cell.v is not None:
                            sheet_info['cells_with_data'] += 1
                            # Classify data types. bool is tested first because
                            # it is a subclass of int in Python.
                            if isinstance(cell.v, bool):
                                sheet_info['data_types']['boolean'] += 1
                            elif isinstance(cell.v, str):
                                sheet_info['data_types']['text'] += 1
                            elif isinstance(cell.v, (int, float)):
                                # pyxlsb reports dates as serial numbers, so date
                                # cells are counted here as well
                                sheet_info['data_types']['number'] += 1
                            elif isinstance(cell.v, datetime):
                                sheet_info['data_types']['date'] += 1
                            else:
                                sheet_info['data_types']['error'] += 1
                        else:
                            sheet_info['data_types']['empty'] += 1
                        row_data.append(cell.v)
                    rows_data.append(row_data)
                    # Collect sample data (first 10 rows)
                    if row_idx < 10:
                        sheet_info['sample_data'].append(row_data[:10])
                sheet_info['rows'] = len(rows_data)
                sheet_info['columns'] = max_col
        except Exception as e:
            print(f"Error analyzing worksheet {sheet_name}: {e}")
        return sheet_info

    def _check_for_macros(self):
        """Check if file contains VBA macros."""
        try:
            with zipfile.ZipFile(self.file_path, 'r') as zip_file:
                for file_name in zip_file.namelist():
                    if 'vbaProject' in file_name:
                        return True
        except (zipfile.BadZipFile, OSError):
            # Unreadable or non-ZIP input is reported as "no macros"
            pass
        return False

    def convert_to_dataframes(self):
        """Convert all worksheets to pandas DataFrames."""
        dataframes = {}
        try:
            with pyxlsb.open_workbook(self.file_path) as workbook:
                for sheet_name in workbook.sheets:
                    try:
                        # Read worksheet data
                        data = []
                        with workbook.get_sheet(sheet_name) as sheet:
                            for row in sheet.rows():
                                data.append([cell.v for cell in row])
                        if data:
                            # Convert to DataFrame; the first row supplies the column names
                            df = pd.DataFrame(data[1:], columns=data[0])
                            dataframes[sheet_name] = df
                            print(f"Converted worksheet '{sheet_name}' to DataFrame: {df.shape}")
                    except Exception as e:
                        print(f"Error converting worksheet {sheet_name}: {e}")
        except Exception as e:
            print(f"Error opening workbook: {e}")
        return dataframes

    def extract_to_csv(self, output_dir=None):
        """Extract all worksheets to CSV files."""
        import csv
        if not output_dir:
            # abspath avoids an empty directory name when file_path has no folder part
            output_dir = os.path.dirname(os.path.abspath(self.file_path))
        os.makedirs(output_dir, exist_ok=True)
        base_name = os.path.splitext(os.path.basename(self.file_path))[0]
        csv_files = []
        try:
            with pyxlsb.open_workbook(self.file_path) as workbook:
                for sheet_name in workbook.sheets:
                    # Keep only characters that are safe in a file name
                    safe_sheet_name = "".join(c for c in sheet_name if c.isalnum() or c in (' ', '-', '_')).rstrip()
                    csv_filename = f"{base_name}_{safe_sheet_name}.csv"
                    csv_path = os.path.join(output_dir, csv_filename)
                    try:
                        with workbook.get_sheet(sheet_name) as sheet:
                            with open(csv_path, 'w', newline='', encoding='utf-8') as csvfile:
                                writer = csv.writer(csvfile)
                                for row in sheet.rows():
                                    writer.writerow([str(cell.v) if cell.v is not None else '' for cell in row])
                        csv_files.append(csv_path)
                        print(f"Exported worksheet '{sheet_name}' to {csv_path}")
                    except Exception as e:
                        print(f"Error exporting worksheet {sheet_name}: {e}")
        except Exception as e:
            print(f"Error processing workbook: {e}")
        return csv_files

    def compare_performance_with_xlsx(self, xlsx_path):
        """Compare performance between XLSB and XLSX files."""
        import time
        results = {
            'xlsb': {'file_size': 0, 'read_time': 0},
            'xlsx': {'file_size': 0, 'read_time': 0}
        }
        # XLSB performance
        results['xlsb']['file_size'] = os.path.getsize(self.file_path)
        start_time = time.time()
        try:
            dfs_xlsb = self.convert_to_dataframes()
            results['xlsb']['read_time'] = time.time() - start_time
            results['xlsb']['sheets_read'] = len(dfs_xlsb)
        except Exception as e:
            print(f"Error reading XLSB: {e}")
            results['xlsb']['read_time'] = float('inf')
        # XLSX performance
        if os.path.exists(xlsx_path):
            results['xlsx']['file_size'] = os.path.getsize(xlsx_path)
            start_time = time.time()
            try:
                dfs_xlsx = pd.read_excel(xlsx_path, sheet_name=None, engine='openpyxl')
                results['xlsx']['read_time'] = time.time() - start_time
                results['xlsx']['sheets_read'] = len(dfs_xlsx)
            except Exception as e:
                print(f"Error reading XLSX: {e}")
                results['xlsx']['read_time'] = float('inf')
        return results


def batch_process_xlsb_files(directory):
    """Process all XLSB files in a directory."""
    xlsb_files = [f for f in os.listdir(directory) if f.lower().endswith('.xlsb')]
    results = []
    for filename in xlsb_files:
        file_path = os.path.join(directory, filename)
        print(f"\nProcessing: {filename}")
        try:
            processor = XLSBProcessor(file_path)
            file_info = processor.analyze_file()
            # Extract to CSV
            csv_files = processor.extract_to_csv()
            file_info['csv_files'] = csv_files
            results.append(file_info)
            print(f" File size: {file_info['file_size'] / 1024:.1f} KB")
            print(f" Worksheets: {file_info.get('worksheet_count', 0)}")
            print(f" Has macros: {file_info['has_macros']}")
            print(f" CSV files created: {len(csv_files)}")
        except Exception as e:
            print(f" Error processing {filename}: {e}")
    return results


def create_performance_test_file(filename, rows=10000, cols=20):
    """Create a large XLSX file that can be converted to XLSB for performance testing."""
    import xlsxwriter
    # Create XLSX first, then convert
    xlsx_filename = filename.replace('.xlsb', '.xlsx')
    workbook = xlsxwriter.Workbook(xlsx_filename)
    worksheet = workbook.add_worksheet('Performance Test')
    # Add headers
    for col in range(cols):
        worksheet.write(0, col, f'Column_{col+1}')
    # Add data
    for row in range(1, rows + 1):
        for col in range(cols):
            if col % 4 == 0:
                worksheet.write(row, col, f'Text_{row}_{col}')
            elif col % 4 == 1:
                worksheet.write(row, col, row * col)
            elif col % 4 == 2:
                # Strings starting with '=' are written as formulas
                worksheet.write(row, col, f'=B{row+1}*2')
            else:
                worksheet.write(row, col, row / (col + 1))
        if row % 1000 == 0:
            print(f"Generated {row} rows...")
    workbook.close()
    print(f"Created XLSX file: {xlsx_filename}")
    print("Note: Convert to XLSB using Excel for true performance testing")
    return xlsx_filename


# Usage example
if __name__ == "__main__":
    # Create test file
    test_file = create_performance_test_file("test_performance.xlsb", 5000, 15)
    # Process XLSB file (if available)
    xlsb_files = [f for f in os.listdir('.') if f.lower().endswith('.xlsb')]
    if xlsb_files:
        for xlsb_file in xlsb_files:
            print(f"\nAnalyzing: {xlsb_file}")
            processor = XLSBProcessor(xlsb_file)
            # Analyze file
            info = processor.analyze_file()
            print(f"File: {info['filename']}")
            print(f"Size: {info['file_size'] / 1024:.1f} KB")
            print(f"Worksheets: {info.get('worksheet_count', 0)}")
            print(f"Has macros: {info['has_macros']}")
            for ws in info['worksheets']:
                print(f"\n Worksheet: {ws['name']}")
                print(f" Dimensions: {ws['rows']} × {ws['columns']}")
                print(f" Cells with data: {ws['cells_with_data']}")
                print(f" Data types: {ws['data_types']}")
            # Convert to DataFrames
            dataframes = processor.convert_to_dataframes()
            # Extract to CSV
            csv_files = processor.extract_to_csv()
            print(f"\nCreated {len(csv_files)} CSV files")
    else:
        print("No XLSB files found in current directory")
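One detail the listing above does not cover is dates. The pyxlsb documentation notes that date cells come back as floating-point serial numbers, so a type check against `datetime` will not match them. The sketch below converts a serial number by hand, assuming the workbook uses Excel's default 1900 date system; `serial_to_datetime` is a hypothetical helper name, and pyxlsb also ships its own date conversion helper that may be preferable in practice.

```python
from datetime import datetime, timedelta

# Day zero for the 1900 date system. Using 1899-12-30 compensates for
# Excel treating 1900 as a leap year, so serials from 61 onward are exact.
EXCEL_EPOCH = datetime(1899, 12, 30)


def serial_to_datetime(serial: float) -> datetime:
    """Convert an Excel serial date (1900 system) to a datetime.

    The fractional part of the serial encodes the time of day.
    """
    return EXCEL_EPOCH + timedelta(days=serial)


new_year = serial_to_datetime(45292)
midday = serial_to_datetime(45292.5)
print(new_year.isoformat())
print(midday.isoformat())
```

Whether a given float is a date or an ordinary number depends on the cell's number format, which the cell value alone does not reveal, so the conversion should be applied only to columns known to hold dates.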
Performance Comparison
XLSB vs XLSX Benchmark
# PowerShell script for performance comparison
$xlsbFile = "test.xlsb"
$xlsxFile = "test.xlsx"
# File size comparison
$xlsbSize = (Get-Item $xlsbFile).Length
$xlsxSize = (Get-Item $xlsxFile).Length
$sizeRatio = [math]::Round(($xlsxSize - $xlsbSize) / $xlsbSize * 100, 1)
Write-Host "File Size Comparison:"
Write-Host "XLSB: $([math]::Round($xlsbSize/1MB, 2)) MB"
Write-Host "XLSX: $([math]::Round($xlsxSize/1MB, 2)) MB"
Write-Host "XLSX is $sizeRatio% larger than XLSB"
# Opening time comparison (requires Excel)
$excel = New-Object -ComObject Excel.Application
$excel.Visible = $false
$excel.DisplayAlerts = $false
# Test XLSB opening time
$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
$workbook1 = $excel.Workbooks.Open((Resolve-Path $xlsbFile).Path)
$xlsbTime = $stopwatch.ElapsedMilliseconds
$workbook1.Close($false)  # close without saving
# Test XLSX opening time
$stopwatch.Restart()
$workbook2 = $excel.Workbooks.Open((Resolve-Path $xlsxFile).Path)
$xlsxTime = $stopwatch.ElapsedMilliseconds
$workbook2.Close($false)  # close without saving
$excel.Quit()
# Release the COM object so the Excel process can exit
[System.Runtime.InteropServices.Marshal]::ReleaseComObject($excel) | Out-Null
Write-Host "`nOpening Time Comparison:"
Write-Host "XLSB: $xlsbTime ms"
Write-Host "XLSX: $xlsxTime ms"
Write-Host "XLSB is $([math]::Round(($xlsxTime - $xlsbTime) / $xlsxTime * 100, 1))% faster"
Security Considerations
- XLSB files can contain VBA macros with security risks
- Validate file structure and content before processing
- Use trusted sources and scan for malware
- Consider disabling macros in production environments
- Implement proper error handling for corrupted files
- Be aware of potential zip bomb attacks
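The zip bomb concern above can be addressed with a size check before any part is extracted. The following is a minimal sketch using the standard `zipfile` module; the function name `check_package_size` and both thresholds are illustrative assumptions that would need tuning for real workloads.

```python
import io
import zipfile


def check_package_size(source, max_total_bytes=500 * 1024 * 1024, max_ratio=200):
    """Return (ok, reason) based on the sizes declared in the ZIP directory."""
    with zipfile.ZipFile(source) as package:
        total = 0
        for info in package.infolist():
            total += info.file_size
            # A very high compression ratio is a common zip bomb signature
            if info.compress_size > 0 and info.file_size / info.compress_size > max_ratio:
                return False, f"{info.filename}: suspicious compression ratio"
            if total > max_total_bytes:
                return False, "declared uncompressed size exceeds limit"
    return True, "ok"


def make_zip(payload: bytes) -> io.BytesIO:
    """Build a one-entry deflated ZIP in memory for the demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("xl/worksheets/sheet1.bin", payload)
    buffer.seek(0)
    return buffer


small_ok, small_reason = check_package_size(make_zip(b"hello"))
bomb_ok, bomb_reason = check_package_size(make_zip(b"\0" * 1_000_000))
print(small_ok, small_reason)
print(bomb_ok, bomb_reason)
```

The check relies on sizes recorded in the archive's directory, which a crafted file can misstate, so it works best alongside a hard cap on the number of bytes actually read during extraction.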
Best Practices
- Use XLSB for large files requiring frequent access
- Prefer XLSB for files with complex calculations
- Test performance differences in your specific use case
- Maintain XLSX versions for better compatibility
- Document any macros or custom functionality
- Run regular file integrity checks on critical data
- Consider file size vs. compatibility trade-offs
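For the advice to test performance in your own use case, a small timing harness helps keep measurements comparable. This is a sketch only: `time_callable` is a hypothetical helper, and the workload shown is a stand-in for whatever read or write operation is being compared.

```python
import statistics
import time


def time_callable(func, repeats=5):
    """Run func several times and return the median wall-clock duration in seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    # The median is less sensitive to one-off delays than the mean
    return statistics.median(samples)


# Stand-in workload; replace with a function that opens or saves a workbook
elapsed = time_callable(lambda: sum(range(10_000)))
print(f"median: {elapsed * 1000:.3f} ms")
```

Repeating the measurement matters because the first read of a file is often slower than later ones once the operating system has cached it.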
Migration and Compatibility
- From XLS: Direct conversion available
- To XLSX: Full conversion with macro removal
- Cross-platform: Limited support outside Windows/Excel
- Alternative tools: pyxlsb reads XLSB directly (read-only); OpenPyXL and ClosedXML handle XLSX/XLSM only, so they apply after conversion
- Cloud compatibility: Limited support in cloud platforms
Troubleshooting Common Issues
- Library compatibility: Use a library that explicitly supports XLSB, such as pyxlsb; general XLSX libraries usually reject the format
- Macro execution: Test macro functionality after conversion
- Performance degradation: Check for file corruption
- Memory issues: Use streaming readers for very large files
- Character encoding: Handle international characters properly
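For the memory issue listed above, one option is to consume rows in fixed-size batches instead of loading a whole sheet into a list. The sketch below is library-agnostic and works on any iterable of rows; `iter_chunks` and `fake_rows` are hypothetical names used for illustration.

```python
from itertools import islice


def iter_chunks(rows, chunk_size=1000):
    """Yield lists of at most chunk_size rows without materializing all rows."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def fake_rows(count):
    """Stand-in for a sheet reader that produces rows lazily."""
    for i in range(count):
        yield [i, f"value_{i}"]


chunk_sizes = [len(chunk) for chunk in iter_chunks(fake_rows(2500), chunk_size=1000)]
print(chunk_sizes)
```

With pyxlsb, the iterator returned by `sheet.rows()` could be passed in place of `fake_rows(...)`, so that each batch is processed or written out before the next one is read.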