Data module
OutlierDataset | OutlierDataset for combining instances from (known) training and (unknown) outlier data.
DataWrapper | DataWrapper for base datasets.
configure_oneclass_division | Method for obtaining configurations for OSR model evaluation using Holdout protocol (both KKC and UUC from single dataset) for One-class classification.
configure_division | Method for obtaining configurations for OSR model evaluation using Holdout protocol (both KKC and UUC from single dataset).
get_train_test | Method for obtaining Cross-validation folds using Holdout protocol (both KKC and UUC from single dataset).
configure_division_outlier | Method for obtaining configurations for OSR model evaluation using Outlier protocol.
get_train_test_outlier | Method for obtaining Cross-validation folds using Outlier protocol (KKC obtained from base_dataset and UUC from outlier_dataset).
- class torchosr.data.DataWrapper(root: str, base_dataset, get_classes, known_classes, return_only_known, indexes='all', onehot=False, onehot_num_classes=None)
Bases: VisionDataset
DataWrapper for base datasets.
- Parameters:
root (string) – Data directory.
base_dataset (VisionDataset) – base dataset implementing __getitem__ function.
indexes (List) – Indexes of wrapped dataset objects that will be returned.
get_classes (List) – Considered class indexes from base dataset (known + unknown).
known_classes (List) – Considered known class indexes from base dataset.
return_only_known (boolean) – If True, returns only known instances (for training). If False, assigns a new class index (equal to the number of classes) to unknown samples and returns them together with the known data (for testing).
onehot (boolean) – If True, performs one-hot encoding on labels.
onehot_num_classes (int) – Number of classes for one-hot encoding (in case outlier data is generated for testing).
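The relabeling controlled by get_classes, known_classes and return_only_known can be illustrated with a small pure-Python sketch. It is not the torchosr implementation: `remap_labels` is a hypothetical helper, and it assumes that known classes are renumbered 0..K-1 in the order given and that "the number of classes" used for unknown samples means the number of known classes.

```python
from typing import List, Sequence, Tuple


def remap_labels(
    labels: Sequence[int],
    get_classes: Sequence[int],
    known_classes: Sequence[int],
    return_only_known: bool,
) -> List[Tuple[int, int]]:
    """Return (sample_index, new_label) pairs for the considered classes.

    Hypothetical helper for illustration only, not part of torchosr.
    """
    # Assumption: known classes are renumbered 0..K-1 in the given order.
    known_to_new = {c: i for i, c in enumerate(known_classes)}
    # Assumption: every unknown sample shares one new index, K.
    unknown_label = len(known_classes)
    considered = set(get_classes)

    pairs = []
    for idx, label in enumerate(labels):
        if label not in considered:
            continue  # class is not part of this configuration at all
        if label in known_to_new:
            pairs.append((idx, known_to_new[label]))
        elif not return_only_known:
            pairs.append((idx, unknown_label))  # unknown, kept for testing
    return pairs


labels = [0, 3, 5, 7, 3, 9]
# Training view: only samples of the known classes 3 and 5 survive.
train_view = remap_labels(labels, [3, 5, 7], [3, 5], return_only_known=True)
# Testing view: class 7 is kept and mapped to the shared unknown index 2.
test_view = remap_labels(labels, [3, 5, 7], [3, 5], return_only_known=False)
```

In this toy example the training view drops the sample of class 7, while the testing view keeps it under the extra label.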
- class torchosr.data.OutlierDataset(root: str, dataset, outliers, shuffle: bool = True, random_state: int | None = None, unknown_label: int | None = None, onehot=False, onehot_num_classes=None)
Bases: VisionDataset
OutlierDataset for combining instances from (known) training and (unknown) outlier data.
- Parameters:
root (string) – Data directory.
dataset (VisionDataset) – Dataset of known-class testing examples.
outliers (VisionDataset) – Dataset of unknown-class testing examples, which will be labeled as unknowns.
shuffle (boolean) – If True, the final data will be shuffled.
random_state (int) – Random state (for shuffle).
unknown_label (int) – Label with which the unknowns will be marked.
onehot (boolean) – If True, performs one-hot encoding on labels.
onehot_num_classes (int) – Number of classes for one-hot encoding (in case outlier data is generated for testing).
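The idea of merging known test samples with relabeled outliers can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `combine_with_outliers` is a hypothetical function, samples are modeled as (item, label) tuples, and the fallback used when unknown_label is None (one past the largest known label) is an assumption.

```python
import random
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[object, int]


def combine_with_outliers(
    known: Sequence[Sample],
    outliers: Sequence[Sample],
    shuffle: bool = True,
    random_state: Optional[int] = None,
    unknown_label: Optional[int] = None,
) -> List[Sample]:
    """Merge known samples with outliers that are relabeled as unknown.

    Hypothetical helper for illustration only, not part of torchosr.
    """
    if unknown_label is None:
        # Assumption: default to one past the largest known label.
        unknown_label = max(label for _, label in known) + 1
    # Original outlier labels are discarded; all outliers share one label.
    combined = list(known) + [(item, unknown_label) for item, _ in outliers]
    if shuffle:
        # A seeded generator makes the shuffled order reproducible.
        random.Random(random_state).shuffle(combined)
    return combined


known = [("a", 0), ("b", 1)]
outliers = [("x", 7), ("y", 3)]
merged = combine_with_outliers(known, outliers, shuffle=False)
shuffled = combine_with_outliers(known, outliers, random_state=0)
```

With shuffle=False the known samples come first and both outliers carry the label 2; with a fixed random_state the shuffled order is the same on every run.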
- torchosr.data.configure_division(base_dataset, repeats, n_openness=None, seed=None, min_known_classes=2)
Method for obtaining configurations for OSR model evaluation using Holdout protocol (both KKC and UUC from single dataset).
- Parameters:
base_dataset (VisionDataset) – Base dataset
repeats (int) – Number of random selections of classes for single openness (KKC/UUC class cardinality)
n_openness (int) – Number of KKC/UUC class cardinalities to generate. If None, all possible configurations are returned.
seed (int) – Random state
min_known_classes (int) – Minimum number of known classes
- Return type:
List
- Returns:
List of dataset configurations – each containing sets of KKC and UUC – and their Openness
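A minimal sketch of how such Holdout configurations could be generated is shown below. It is not the torchosr implementation: `sample_configurations` and `openness` are hypothetical names, the n_openness subsampling is left out, and the Openness formula is the commonly used definition of Scheirer et al. with the KKC as training and target classes, which is an assumption here because this page does not state the formula.

```python
import math
import random
from typing import List, Optional, Tuple

Config = Tuple[List[int], List[int], float]


def openness(n_kkc: int, n_uuc: int) -> float:
    """Assumed Openness definition: 1 - sqrt(2*K / (2*K + U))."""
    return 1.0 - math.sqrt(2.0 * n_kkc / (2.0 * n_kkc + n_uuc))


def sample_configurations(
    n_classes: int,
    repeats: int,
    seed: Optional[int] = None,
    min_known_classes: int = 2,
) -> List[Config]:
    """Draw `repeats` random KKC/UUC splits for every class cardinality.

    Hypothetical helper for illustration only, not part of torchosr.
    """
    rng = random.Random(seed)
    classes = list(range(n_classes))
    configs = []
    for n_kkc in range(min_known_classes, n_classes):
        for n_uuc in range(1, n_classes - n_kkc + 1):
            for _ in range(repeats):
                # One draw without replacement keeps KKC and UUC disjoint.
                picked = rng.sample(classes, n_kkc + n_uuc)
                kkc, uuc = sorted(picked[:n_kkc]), sorted(picked[n_kkc:])
                configs.append((kkc, uuc, openness(n_kkc, n_uuc)))
    return configs


configs = sample_configurations(n_classes=4, repeats=2, seed=0)
```

For four classes and at least two known classes there are three KKC/UUC cardinalities (2/1, 2/2 and 3/1), so two repeats yield six configurations.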
- torchosr.data.configure_division_outlier(base_dataset, outlier_dataset, repeats, n_openness=None, seed=None, min_known_classes=2)
Method for obtaining configurations for OSR model evaluation using Outlier protocol. KKC come from base_dataset, UUC from outlier_dataset.
- Parameters:
base_dataset (VisionDataset) – Dataset describing KKC instances
outlier_dataset (VisionDataset) – Dataset describing UUC instances
repeats (int) – Number of random selections of classes for single openness (KKC/UUC class cardinality)
n_openness (int) – Number of KKC/UUC class cardinalities to generate
seed (int) – Random state
min_known_classes (int) – Minimum number of known classes
- Return type:
List
- Returns:
List of dataset configurations – each containing sets of KKC and UUC – and their Openness
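Under the Outlier protocol the two class sets come from different datasets, so they can be sampled from independent pools. The sketch below illustrates that idea only: `sample_outlier_configurations` is a hypothetical function, the cardinality ranges are an assumption, and both the Openness value and the n_openness subsampling are omitted for brevity.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_outlier_configurations(
    base_classes: Sequence[int],
    outlier_classes: Sequence[int],
    repeats: int,
    seed: Optional[int] = None,
    min_known_classes: int = 2,
) -> List[Tuple[List[int], List[int]]]:
    """Sample KKC from the base labels and UUC from the outlier labels.

    Hypothetical helper for illustration only, not part of torchosr.
    """
    rng = random.Random(seed)
    base, outl = list(base_classes), list(outlier_classes)
    configs = []
    for n_kkc in range(min_known_classes, len(base) + 1):
        for n_uuc in range(1, len(outl) + 1):
            for _ in range(repeats):
                # The pools are independent: equal label values in KKC and
                # UUC refer to classes of two different datasets.
                kkc = sorted(rng.sample(base, n_kkc))
                uuc = sorted(rng.sample(outl, n_uuc))
                configs.append((kkc, uuc))
    return configs


outlier_configs = sample_outlier_configurations([0, 1, 2], [0, 1], repeats=1, seed=0)
```

Unlike the Holdout case, all base classes may be known at once here, because the unknown classes never have to be taken from the base dataset.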
- torchosr.data.configure_oneclass_division(base_dataset, repeats, n_openness=None, seed=None)
Method for obtaining configurations for OSR model evaluation using Holdout protocol (both KKC and UUC from single dataset) for One-class classification. Set of KKC always contains a single class.
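- Parameters:
base_dataset (VisionDataset) – Base dataset
repeats (int) – Number of random selections of classes for single openness (KKC/UUC class cardinality)
n_openness (int) – Number of KKC/UUC class cardinalities to generate. If None, all possible configurations are returned.
seed (int) – Random state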
- Return type:
List
- Returns:
List of dataset configurations – each containing sets of KKC and UUC – and their Openness
- torchosr.data.get_train_test(base_dataset, kkc_indexes, uuc_indexes, root, tunning, fold, seed=1410, n_folds=5)
Method for obtaining Cross-validation folds using Holdout protocol (both KKC and UUC from single dataset).
- Parameters:
base_dataset (VisionDataset) – Base dataset
kkc_indexes (List) – List of labels constituting Known Classes
uuc_indexes (List) – List of labels constituting Unknown Classes
root (string) – Datasets folder
tunning (boolean) – Flag. If True, the split is made on the 10% of the data used for tuning; otherwise it is made on 90% of the data.
fold (int) – Fold index
n_folds (int) – Number of folds
seed (int) – Random state
- Return type:
List
- Returns:
Train dataset, Test dataset
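The interplay of tunning, fold and n_folds can be sketched with index lists instead of datasets. This is a simplified illustration, not the library's code: `holdout_fold` is a hypothetical function, the per-class round-robin fold assignment stands in for whatever stratified splitter the library uses, and the assumption that training folds hold only KKC samples while test folds also hold UUC samples follows the DataWrapper description.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple


def holdout_fold(
    labels: Sequence[int],
    kkc_indexes: Sequence[int],
    uuc_indexes: Sequence[int],
    tunning: bool,
    fold: int,
    seed: int = 1410,
    n_folds: int = 5,
) -> Tuple[List[int], List[int]]:
    """Return (train, test) sample indices for one cross-validation fold.

    Hypothetical helper for illustration only, not part of torchosr.
    """
    rng = random.Random(seed)
    kkc, uuc = set(kkc_indexes), set(uuc_indexes)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        if label in kkc or label in uuc:
            by_class[label].append(idx)

    train, test = [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Assumption: 10% of each class is set aside for tuning.
        cut = int(round(len(idxs) * 0.1))
        pool = idxs[:cut] if tunning else idxs[cut:]
        for pos, idx in enumerate(pool):
            if pos % n_folds == fold:
                test.append(idx)   # test fold: KKC and UUC samples
            elif label in kkc:
                train.append(idx)  # training folds: KKC samples only
    return sorted(train), sorted(test)


labels = [c for c in range(4) for _ in range(20)]  # 4 classes, 20 samples each
train_idx, test_idx = holdout_fold(labels, [0, 1], [2], tunning=False, fold=0)
```

Because the same seed is used for every fold index, the test folds of one configuration partition the selected pool and never overlap.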
- torchosr.data.get_train_test_outlier(base_dataset, outlier_dataset, kkc_indexes, uuc_indexes, root, tunning, fold, seed=1410, n_folds=5)
Method for obtaining Cross-validation folds using Outlier protocol (KKC obtained from base_dataset and UUC from outlier_dataset).
- Parameters:
base_dataset (VisionDataset) – Dataset describing KKC instances
outlier_dataset (VisionDataset) – Dataset describing UUC instances
kkc_indexes (List) – List of labels constituting Known Classes
uuc_indexes (List) – List of labels constituting Unknown Classes
root (string) – Datasets folder
tunning (boolean) – Flag. If True, the split is made on the 10% of the data used for tuning; otherwise it is made on 90% of the data.
fold (int) – Fold index
n_folds (int) – Number of folds
seed (int) – Random state
- Return type:
List
- Returns:
Train dataset, Test dataset