Fashion-MNIST Dataset
Fashion-MNIST is a dataset of Zalando's article images designed as a drop-in replacement for the original MNIST dataset. It's ideal for quick prototyping, learning, and benchmarking classification algorithms.
Overview
Dataset Statistics:
- 60,000 training examples
- 10,000 test examples
- 10 classes: T-shirt/top, Trouser, Pullover, Dress, Coat, Sandal, Shirt, Sneaker, Bag, Ankle boot
- 28×28 grayscale images
- ~60MB total size (downloads quickly)
Class Labels:
- 0: T-shirt/top
- 1: Trouser
- 2: Pullover
- 3: Dress
- 4: Coat
- 5: Sandal
- 6: Shirt
- 7: Sneaker
- 8: Bag
- 9: Ankle boot
Reference: Fashion-MNIST GitHub
Installation
Fashion-MNIST requires no additional dependencies beyond the core OpenTryOn package. The dataset will be automatically downloaded on first use.
Usage
Class-Based Approach (Recommended)
from tryon.datasets import FashionMNIST
# Create dataset instance (downloads automatically)
dataset = FashionMNIST(download=True)
# Get dataset info
info = dataset.get_info()
print(f"Train size: {info['train_size']}") # 60000
print(f"Test size: {info['test_size']}") # 10000
print(f"Classes: {info['num_classes']}") # 10
# Load the dataset
(train_images, train_labels), (test_images, test_labels) = dataset.load(
    normalize=True,
    flatten=False
)
print(f"Training set: {train_images.shape}") # (60000, 28, 28)
print(f"Test set: {test_images.shape}") # (10000, 28, 28)
# Get class names
print(dataset.get_class_name(0)) # 'T-shirt/top'
print(dataset.get_class_names()) # ['T-shirt/top', 'Trouser', ...]
Function-Based Approach
from tryon.datasets import load_fashion_mnist, get_fashion_mnist_class_name
# Load dataset
(train_images, train_labels), (test_images, test_labels) = load_fashion_mnist(
    normalize=True,
    flatten=False
)
# Get class name
class_name = get_fashion_mnist_class_name(0) # 'T-shirt/top'
API Reference
Class: FashionMNIST
Fashion-MNIST dataset adapter class.
Constructor
FashionMNIST(data_dir: Optional[str] = None, download: bool = True)
Parameters:
- data_dir (str, optional): Directory to store the dataset. Defaults to ~/.opentryon/datasets/fashion_mnist
- download (bool): If True, download the dataset if it doesn't exist. Default: True
Example:
# Use default directory
dataset = FashionMNIST(download=True)
# Use custom directory
dataset = FashionMNIST(
    data_dir="./my_datasets/fashion_mnist",
    download=True
)
Methods
load(normalize=False, flatten=False)
Load Fashion-MNIST dataset into memory.
Parameters:
- normalize (bool): If True, normalize pixel values to the [0, 1] range. Default: False
  - When False: pixel values are integers 0-255
  - When True: pixel values are floats 0.0-1.0
- flatten (bool): If True, flatten images to 1D arrays. Default: False
  - When False: images are shape (28, 28)
  - When True: images are shape (784,)
Returns:
(train_data, test_data) tuple, where each element is (images, labels):
- train_images: numpy array of shape (60000, 28, 28), or (60000, 784) if flatten=True
- train_labels: numpy array of shape (60000,) with integer values 0-9
- test_images: numpy array of shape (10000, 28, 28), or (10000, 784) if flatten=True
- test_labels: numpy array of shape (10000,) with integer values 0-9
Data Types:
- Images: uint8 if normalize=False, float32 if normalize=True
- Labels: uint8
Example:
dataset = FashionMNIST()
# Load without normalization
(train_images, train_labels), (test_images, test_labels) = dataset.load(
    normalize=False,
    flatten=False
)
print(train_images.dtype) # uint8
print(train_images.shape) # (60000, 28, 28)
# Load with normalization
(train_images, train_labels), (test_images, test_labels) = dataset.load(
    normalize=True,
    flatten=False
)
print(train_images.dtype) # float32
print(train_images.min(), train_images.max()) # 0.0 1.0
# Load flattened
(train_images, train_labels), (test_images, test_labels) = dataset.load(
    normalize=True,
    flatten=True
)
print(train_images.shape) # (60000, 784)
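To make the effect of the two flags concrete, here is a minimal NumPy sketch that applies the documented transformations to a small synthetic batch instead of the real dataset. It assumes that normalization is a division by 255 and that flattening is a row-major reshape; the code inside FashionMNIST.load() is not shown on this page and may differ.

```python
import numpy as np

# Synthetic stand-in for raw images: 4 images of 28x28 uint8 pixels.
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Assumed equivalent of normalize=True: float32 values scaled into [0, 1].
normalized = raw.astype(np.float32) / 255.0

# Assumed equivalent of flatten=True: one 784-element row per image.
flattened = normalized.reshape(len(normalized), -1)

print(raw.dtype, raw.shape)
print(normalized.dtype, flattened.shape)
```

Reshaping back with flattened.reshape(-1, 28, 28) recovers the image layout, which is useful when a model trained on flat vectors needs its inputs plotted.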
get_info()
Get dataset information and metadata.
Returns:
dict: Dictionary containing:
- name: Dataset name ("Fashion-MNIST")
- train_size: Number of training examples (60000)
- test_size: Number of test examples (10000)
- num_classes: Number of classes (10)
- image_shape: Image shape tuple (28, 28)
- normalized: Whether data is normalized (updated after load())
- flattened: Whether data is flattened (updated after load())
Example:
dataset = FashionMNIST()
info = dataset.get_info()
print(info)
# {
# 'name': 'Fashion-MNIST',
# 'train_size': 60000,
# 'test_size': 10000,
# 'num_classes': 10,
# 'image_shape': (28, 28),
# 'normalized': False,
# 'flattened': False
# }
get_class_name(class_id: int)
Get the name of a class by its ID.
Parameters:
- class_id (int): Class ID (0-9)
Returns:
str: Class name
Example:
dataset = FashionMNIST()
print(dataset.get_class_name(0)) # 'T-shirt/top'
print(dataset.get_class_name(9)) # 'Ankle boot'
get_class_names()
Get all class names.
Returns:
list[str]: List of all class names
Example:
dataset = FashionMNIST()
class_names = dataset.get_class_names()
print(class_names)
# ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
# 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
Functions
load_fashion_mnist(normalize=False, flatten=False)
Convenience function to load Fashion-MNIST dataset.
Parameters:
- normalize (bool): Normalize pixel values to [0, 1]. Default: False
- flatten (bool): Flatten images to 1D arrays. Default: False
Returns:
(train_data, test_data) tuple, where each element is (images, labels)
Example:
from tryon.datasets import load_fashion_mnist
(train_images, train_labels), (test_images, test_labels) = load_fashion_mnist(
    normalize=True,
    flatten=False
)
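After loading, it can be worth confirming that the arrays match the layout documented for load(). The sketch below uses a hypothetical helper, check_split, which is not part of OpenTryOn, and runs it on synthetic arrays so that it works without downloading anything.

```python
import numpy as np

def check_split(images: np.ndarray, labels: np.ndarray) -> None:
    """Hypothetical helper: raise ValueError if a split breaks the documented layout."""
    if images.shape[1:] not in {(28, 28), (784,)}:
        raise ValueError(f"unexpected image shape {images.shape[1:]}")
    if len(images) != len(labels):
        raise ValueError("images and labels differ in length")
    if labels.min() < 0 or labels.max() > 9:
        raise ValueError("labels must lie in 0-9")
    if images.dtype == np.float32 and (images.min() < 0.0 or images.max() > 1.0):
        raise ValueError("normalized pixels must lie in [0, 1]")

# Synthetic stand-in for one split: 5 blank images with valid labels.
images = np.zeros((5, 28, 28), dtype=np.uint8)
labels = np.array([0, 1, 2, 3, 9], dtype=np.uint8)

check_split(images, labels)  # passes silently for well-formed data
```

With the real dataset, the same call would be made once per split, for example check_split(train_images, train_labels).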
get_fashion_mnist_class_name(class_id: int)
Get class name by ID.
Parameters:
- class_id (int): Class ID (0-9)
Returns:
str: Class name
get_fashion_mnist_class_names()
Get all class names.
Returns:
list[str]: List of all class names
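The class-name helpers amount to a lookup from integer label to string. The following standalone sketch shows one way to summarise a label array by class name. The label_to_name function is a hypothetical stand-in written for illustration, and its error handling for out-of-range IDs is an assumption; the behaviour of the real get_fashion_mnist_class_name() in that case is not documented here.

```python
from collections import Counter

# Class names in label order, copied from the list on this page.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

def label_to_name(class_id: int) -> str:
    """Hypothetical stand-in for get_fashion_mnist_class_name()."""
    if not 0 <= class_id < len(CLASS_NAMES):
        raise ValueError(f"class_id must be in 0-9, got {class_id}")
    return CLASS_NAMES[class_id]

# Made-up labels for illustration; real labels come from the loader.
sample_labels = [9, 0, 0, 3, 9, 9]

# Count how often each class occurs, keyed by readable name.
counts = Counter(label_to_name(label) for label in sample_labels)
print(counts.most_common())
```

With the real API, the list returned by get_fashion_mnist_class_names() could replace CLASS_NAMES.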
Best Practices
Normalization
For machine learning models, it's recommended to normalize pixel values:
dataset = FashionMNIST()
(train_images, train_labels), (test_images, test_labels) = dataset.load(
    normalize=True,  # Normalize to [0, 1]
    flatten=False
)
Memory Management
Fashion-MNIST is small enough (~60MB) to load entirely into memory. However, if you're working with limited memory:
# Keep only the split you need
dataset = FashionMNIST()
(train_images, train_labels), _ = dataset.load(normalize=True)
# The test set is not kept in a named variable, so Python can free it
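As a rough check on these numbers, the in-memory size of each image array follows directly from the documented shapes and dtypes. The sketch below is plain arithmetic; split_megabytes is a hypothetical helper name, and label arrays are ignored because they account for well under 0.1 MB.

```python
# Bytes per pixel for the two image dtypes documented for load().
BYTES_PER_PIXEL = {"uint8": 1, "float32": 4}

def split_megabytes(num_images: int, dtype: str) -> float:
    """Approximate size of one 28x28 image array in MB (10**6 bytes)."""
    return num_images * 28 * 28 * BYTES_PER_PIXEL[dtype] / 1e6

for dtype in BYTES_PER_PIXEL:
    train_mb = split_megabytes(60_000, dtype)
    test_mb = split_megabytes(10_000, dtype)
    print(f"{dtype}: train {train_mb:.1f} MB, test {test_mb:.1f} MB")
```

The uint8 arrays come to roughly 55 MB in total, while the float32 arrays produced by normalize=True are four times larger, at roughly 220 MB.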
Visualization
import matplotlib.pyplot as plt
from tryon.datasets import FashionMNIST
dataset = FashionMNIST()
(train_images, train_labels), (test_images, test_labels) = dataset.load(
    normalize=False,
    flatten=False
)
# Visualize a sample
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i + 1)
    plt.imshow(train_images[i], cmap='gray')
    # Cast the NumPy label to a plain int before looking up its name
    plt.title(dataset.get_class_name(int(train_labels[i])))
    plt.axis('off')
plt.tight_layout()
plt.show()
Examples
See Dataset Examples for complete usage examples.
Troubleshooting
Download Issues
If the dataset fails to download:
- Check your internet connection
- Verify write permissions to the data directory
- Try specifying a custom data_dir: dataset = FashionMNIST(data_dir="./my_datasets/fashion_mnist", download=True)
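The write-permission check in the list above can be done from Python before constructing the dataset. This is a minimal standard-library sketch; is_writable_dir is a hypothetical helper, not part of OpenTryOn, and the example probes a scratch directory under the system temp folder instead of the real data directory.

```python
import os
import tempfile
from pathlib import Path

def is_writable_dir(path: str) -> bool:
    """Return True if `path` can be created (if missing) and written to."""
    directory = Path(path).expanduser()
    try:
        directory.mkdir(parents=True, exist_ok=True)
        # Creating and removing a temporary file is a more reliable test
        # than inspecting permission bits.
        with tempfile.TemporaryFile(dir=directory):
            pass
    except OSError:
        return False
    return True

# Scratch location used only for this demonstration.
candidate = os.path.join(tempfile.gettempdir(), "fashion_mnist_check")
print(is_writable_dir(candidate))
```

To test the location the dataset would use, pass your own data_dir or the documented default, ~/.opentryon/datasets/fashion_mnist. Note that the helper creates the directory if it does not exist.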
Memory Issues
Fashion-MNIST is small, but if you encounter memory issues:
- Use normalize=False to keep data as uint8 (smaller memory footprint)
- Load only the split you need (train or test)