Download datasets
TODO: UPDATE LATER.
GiDA-V1
Please go to Gida-V1, download the dataset, and place it in a folder, say /Dataset.
Tutorial
First-time users should refer to the datasets.py script and review the GidaV6.__init__ function. A minimal example is also provided at the end of the script.
The data interface GidaV6 takes node and edge attributes and outputs a set of records. Each record is a Data instance (visit here for more information). This Data instance contains a snapshot graph described by the sparse adjacency matrix A, the nodal features X, and the edge features E. If labels are available, Y holds the labels of either the nodes or the edges. When both the node and edge sets have their own labels, Y holds the node labels and E_Y holds the edge labels.
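To make the record layout concrete, here is a minimal sketch in plain Python of what one snapshot holds. It is not the actual Data class: `SnapshotSketch` and its field names are hypothetical, the numbers are made up, and the sparse adjacency matrix A is assumed to be stored as a list of (source, target) index pairs with one row of E per stored edge.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SnapshotSketch:
    """Hypothetical stand-in for one record; not the real Data class."""
    edge_index: List[Tuple[int, int]]      # sparse A: one (source, target) pair per edge
    x: List[List[float]]                   # X: one feature row per node
    edge_attr: List[List[float]]           # E: one feature row per edge
    y: Optional[List[float]] = None        # Y: node labels, if available
    edge_y: Optional[List[float]] = None   # E_Y: edge labels, if available

    def validate(self) -> None:
        """Check that every array lines up with the node or edge count."""
        num_nodes, num_edges = len(self.x), len(self.edge_index)
        assert len(self.edge_attr) == num_edges
        assert all(0 <= u < num_nodes and 0 <= v < num_nodes for u, v in self.edge_index)
        assert self.y is None or len(self.y) == num_nodes
        assert self.edge_y is None or len(self.edge_y) == num_edges


# Made-up numbers: 3 nodes joined by 2 pipes, purely illustrative
record = SnapshotSketch(
    edge_index=[(0, 1), (1, 2)],
    x=[[0.5], [1.2], [0.8]],                   # e.g. demand per node
    edge_attr=[[0.3, 100.0], [0.25, 150.0]],   # e.g. diameter and length per pipe
    y=[30.1, 28.4, 27.9],                      # e.g. pressure per node
    edge_y=[0.012, 0.008],                     # e.g. flowrate per pipe
)
record.validate()
```

The point of the sketch is the alignment rule: X and Y have one entry per node, while E and E_Y have one entry per edge stored in A.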
Suppose you want to load the training set of the Anytown network. A simple interface can be declared as follows:
from gigantic_dataset.core.datasets import GidaV6
gida = GidaV6(
    zip_file_paths=[
        r"./Dataset/simgen_Anytown_20240524_1202_csvdir_20240527_1205.zip",  # Anytown dataset
    ],
    node_attrs=[
        "demand",  # load nodal demand
    ],
    edge_attrs=["pipe_diameter", "pipe_length"],  # load some edge properties
    label_attrs=["pressure"],  # node labels Y are pressure
    edge_label_attrs=["flowrate"],  # edge labels E_Y are flowrate
    split_set="train",  # take the train set only
    num_records=100,  # take only 100 records
    selected_snapshots=None,  # take all snapshots
)
# You can call a record directly
print(gida[0]) # Data instance
# Or via a data loader
from torch_geometric.loader import DataLoader
loader = DataLoader(gida, batch_size=1)
print(next(iter(loader)))  # Batch instance
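With a batch size greater than 1, the torch_geometric DataLoader merges several graphs into one larger disconnected graph: node features are concatenated, and each graph's edge indices are shifted by the number of nodes placed before it. The plain-Python sketch below illustrates that idea only. `collate_graphs` is a hypothetical helper, not part of torch_geometric or GidaV6, and the numbers are made up.

```python
from typing import Dict, List, Tuple

Graph = Dict[str, list]  # keys: "x" (node feature rows), "edge_index" ((src, dst) pairs)


def collate_graphs(graphs: List[Graph]) -> Graph:
    """Merge graphs into one disconnected graph, offsetting node indices."""
    x: List[List[float]] = []
    edge_index: List[Tuple[int, int]] = []
    batch: List[int] = []  # graph id of every node in the merged graph
    for graph_id, graph in enumerate(graphs):
        offset = len(x)  # number of nodes already placed in the merged graph
        x.extend(graph["x"])
        edge_index.extend((u + offset, v + offset) for u, v in graph["edge_index"])
        batch.extend([graph_id] * len(graph["x"]))
    return {"x": x, "edge_index": edge_index, "batch": batch}


g1 = {"x": [[0.5], [1.2]], "edge_index": [(0, 1)]}
g2 = {"x": [[0.8], [0.1], [0.4]], "edge_index": [(0, 2), (1, 2)]}
merged = collate_graphs([g1, g2])
print(merged["edge_index"])  # [(0, 1), (2, 4), (3, 4)]
print(merged["batch"])       # [0, 0, 1, 1, 1]
```

The `batch` list plays the same role as the per-node graph assignment vector carried by a Batch instance: it records which original graph each node came from.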