geometric2dr.data

geometric2dr.data.dortmund_formatter

Gexifier for TU Dortmund graph kernel based datasets.

class geometric2dr.data.dortmund_formatter.DortmundGexf(dataset, path_to_dataset, output_dir_for_graph_files)

A class which reads TU Dortmund style datasets and processes them into a corresponding set of .gexf graphs and an associated .Labels file.

This class converts datasets from the format in which the TU Dortmund graph kernel datasets are written into GEXF datasets, which Geo2DR can work with.

It reads the DS_A.txt, DS_graph_indicator.txt, and DS_graph_labels.txt files (where DS is the dataset name) to create a folder of graphs in GEXF format and a file mapping each graph id to its classification label.
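To make the three input files concrete, here is a minimal sketch of how they could be parsed using only the standard library. It is not the code used by DortmundGexf; the function name read_tu_dataset and its return shape are hypothetical. It relies on the TU Dortmund convention that line i of DS_graph_indicator.txt gives the graph id of node i, line i of DS_graph_labels.txt gives the class label of graph i, and each line of DS_A.txt is a comma-separated pair of node ids.

```python
from collections import defaultdict
from pathlib import Path


def read_tu_dataset(dataset_dir, name):
    """Hypothetical helper: parse the three core TU Dortmund files for `name`."""
    root = Path(dataset_dir)

    # Line i (1-indexed) of <name>_graph_indicator.txt is the graph id of node i.
    indicator = (root / f"{name}_graph_indicator.txt").read_text().split()
    node_to_graph = {node: int(gid) for node, gid in enumerate(indicator, start=1)}

    # Line i (1-indexed) of <name>_graph_labels.txt is the class label of graph i.
    raw_labels = (root / f"{name}_graph_labels.txt").read_text().split()
    graph_labels = {gid: int(lab) for gid, lab in enumerate(raw_labels, start=1)}

    # Each line of <name>_A.txt is "row, col": one entry of the sparse
    # adjacency matrix over all nodes of all graphs.
    edges = defaultdict(list)
    for line in (root / f"{name}_A.txt").read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        src, dst = (int(token) for token in line.split(","))
        # Group the edge under the graph that owns its source node.
        edges[node_to_graph[src]].append((src, dst))

    return dict(edges), graph_labels
```

Calling read_tu_dataset("path/to/MUTAG", "MUTAG") would return a dictionary of edge lists keyed by graph id, plus a dictionary of graph labels, assuming the labels are integers as the class description states.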

The saved format will be:
  • dataset_name/dataset_name/<name>.gexf : folder containing an individual GEXF file for each graph.
  • dataset_name/dataset_name.Labels : a file mapping each GEXF file to its integer class label.
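The layout above can be illustrated with a short sketch that creates the folder structure and writes a labels file. This is an illustration only, not the library's implementation: the helper write_labels_file is hypothetical, the layout is assumed to be rooted at the output directory, graph files are assumed to be named after their graph id, and the one-space "<file name> <label>" line format is an assumption, since the documentation does not specify it.

```python
from pathlib import Path


def write_labels_file(output_dir, dataset, graph_labels):
    """Hypothetical helper: create the output layout and write <dataset>.Labels."""
    dataset_root = Path(output_dir) / dataset
    graph_dir = dataset_root / dataset  # folder meant to hold the <name>.gexf files
    graph_dir.mkdir(parents=True, exist_ok=True)

    labels_path = dataset_root / f"{dataset}.Labels"
    with labels_path.open("w") as handle:
        for graph_id, label in sorted(graph_labels.items()):
            # Assumption: one "<file name> <label>" pair per line, space separated.
            handle.write(f"{graph_id}.gexf {label}\n")

    return graph_dir, labels_path
```

The sketch only writes the labels file; producing the GEXF files themselves is left to DortmundGexf.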

See tu_gexifier for a more basic, script-based version. This class version also contains various metadata about the dataset, which may be useful for downstream decomposition algorithms and other analysis.

Parameters:
  • dataset (str) – name of the directory containing the dataset, e.g. “MUTAG”.
  • path_to_dataset (str) – path to the directory that contains the dataset directory.
  • output_dir_for_graph_files (str) – path where the new dataset and labels file will be saved.

format_dataset()

Method which formats the supplied TU Dortmund formatted dataset into GEXF format compatible with other geometric2dr modules.

Returns: The formatted dataset will be saved in output_dir_for_graph_files with the format described above.
Return type: None
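A typical call sequence, based only on the constructor and method signatures documented above, might look like the following sketch. The wrapper function convert_dataset and the example paths are hypothetical. The import is done lazily inside the function so the snippet can be loaded without geometric2dr installed.

```python
def convert_dataset(dataset, path_to_dataset, output_dir_for_graph_files):
    """Hypothetical wrapper: convert one TU Dortmund dataset to GEXF."""
    # Lazy import: geometric2dr is only needed when the function is called.
    from geometric2dr.data.dortmund_formatter import DortmundGexf

    gexifier = DortmundGexf(dataset, path_to_dataset, output_dir_for_graph_files)
    # format_dataset() returns None; its effect is the files written to disk.
    gexifier.format_dataset()


if __name__ == "__main__":
    # Example paths are placeholders: "downloads" is assumed to contain a
    # "MUTAG" directory holding the MUTAG_*.txt files.
    convert_dataset("MUTAG", "downloads", "data")
```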