pytorch_pfn_extras.dataset.TabularDataset
- class pytorch_pfn_extras.dataset.TabularDataset(*args, **kwds)
An abstract class that represents a tabular dataset.
This class represents a tabular dataset. In a tabular dataset, all examples have the same number of elements. For example, all examples of the dataset below have three elements (a[i], b[i], and c[i]).

     a     b     c
0    a[0]  b[0]  c[0]
1    a[1]  b[1]  c[1]
2    a[2]  b[2]  c[2]
3    a[3]  b[3]  c[3]

Since an example can be represented as either a tuple or a dict ((a[i], b[i], c[i]) and {'a': a[i], 'b': b[i], 'c': c[i]}), this class uses mode to indicate which representation will be used. If there is only one column, an example can also be represented by a single value (a[i]). In this case, mode is None.

A subclass should implement __len__(), keys, mode and get_examples().

>>> import numpy as np
>>>
>>> from pytorch_pfn_extras import dataset
>>>
>>> class MyDataset(dataset.TabularDataset):
...
...     def __len__(self):
...         return 4
...
...     @property
...     def keys(self):
...         return ('a', 'b', 'c')
...
...     @property
...     def mode(self):
...         return tuple
...
...     def get_examples(self, indices, key_indices):
...         data = np.arange(12).reshape((4, 3))
...         if indices is not None:
...             data = data[indices]
...         if key_indices is not None:
...             data = data[:, list(key_indices)]
...         return tuple(data.transpose())
...
>>> dataset = MyDataset()
>>> len(dataset)
4
>>> dataset.keys
('a', 'b', 'c')
>>> dataset.astuple()[0]
(0, 1, 2)
>>> sorted(dataset.asdict()[0].items())
[('a', 0), ('b', 1), ('c', 2)]
>>>
>>> view = dataset.slice[[3, 2], ('c', 0)]
>>> len(view)
2
>>> view.keys
('c', 'a')
>>> view.astuple()[1]
(8, 6)
>>> sorted(view.asdict()[1].items())
[('a', 6), ('c', 8)]
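
The relationship between mode and the shape of a single example can be sketched in plain Python without the library. The helper name represent and the column-major tuple layout below are assumptions made for illustration only; this is not how TabularDataset is implemented, just a small model of the three representations described above.

```python
# Illustrative model only: `represent` is a hypothetical helper,
# not part of pytorch_pfn_extras.

def represent(columns, keys, i, mode):
    """Return row ``i`` of column-major ``columns`` in the given mode."""
    row = tuple(column[i] for column in columns)
    if mode is tuple:
        return row
    if mode is dict:
        return dict(zip(keys, row))
    if mode is None:
        # A bare value only makes sense for a single-column table.
        if len(row) != 1:
            raise ValueError("mode=None requires exactly one column")
        return row[0]
    raise ValueError(f"unsupported mode: {mode!r}")


# Columns a, b and c of a 4 x 3 table holding the integers 0..11 row by row.
columns = ((0, 3, 6, 9), (1, 4, 7, 10), (2, 5, 8, 11))
keys = ("a", "b", "c")

as_tuple = represent(columns, keys, 0, tuple)         # tuple mode
as_dict = represent(columns, keys, 0, dict)           # dict mode
as_value = represent(columns[:1], keys[:1], 0, None)  # single column, mode None
```

In this model the stored data never changes; only the wrapper applied to a fetched row differs, which is consistent with asdict() and astuple() being described as returning views.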
- __init__()
Methods

__init__()
asdict()                                 Return a view with dict mode.
astuple()                                Return a view with tuple mode.
concat(*datasets)                        Stack datasets along rows.
convert(data)                            Convert fetched data.
fetch()                                  Fetch data.
get_example(i)
get_examples(indices, key_indices)       Return a part of data.
join(*datasets)                          Stack datasets along columns.
transform(keys, transform)               Apply a transform to each example.
transform_batch(keys, transform_batch)   Apply a transform to examples.
with_converter(converter)                Override the behaviour of convert().

Attributes

keys    Names of columns.
mode    Mode of representation.
slice   Get a slice of dataset.
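
The difference between stacking along rows (concat) and stacking along columns (join) can be pictured with a small plain-Python model. The sketch below represents each table as a dict that maps a column name to a list of values. The helpers concat_rows and join_columns are hypothetical, and the checks they perform (matching keys, matching lengths, no duplicate column names) are assumptions about what a sensible implementation would require, not a statement of what the library enforces.

```python
# Illustrative model only: tables are {column_name: list_of_values} dicts.
# `concat_rows` and `join_columns` are hypothetical helpers,
# not part of pytorch_pfn_extras.

def concat_rows(*tables):
    """Stack tables that share the same keys, one after another."""
    keys = list(tables[0])
    for table in tables[1:]:
        if list(table) != keys:
            raise ValueError("all tables must have the same keys")
    return {key: [v for table in tables for v in table[key]] for key in keys}


def join_columns(*tables):
    """Place tables with the same number of rows side by side."""
    length = len(next(iter(tables[0].values())))
    joined = {}
    for table in tables:
        for key, values in table.items():
            if len(values) != length:
                raise ValueError("all tables must have the same length")
            if key in joined:
                raise ValueError(f"duplicate key: {key!r}")
            joined[key] = list(values)
    return joined


top = {"a": [0, 3], "b": [1, 4]}      # rows 0-1, columns a and b
bottom = {"a": [6, 9], "b": [7, 10]}  # rows 2-3, columns a and b
extra = {"c": [2, 5, 8, 11]}          # rows 0-3, column c

stacked = concat_rows(top, bottom)    # more rows, same columns
table = join_columns(stacked, extra)  # same rows, more columns
```

Stacking along rows increases the number of examples and leaves the keys unchanged, while stacking along columns leaves the number of examples unchanged and extends the keys.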