Saving the graph requires us to implement a graph writer class with two functions: save_vertex() and save_edge().

 class graph_writer {
   public:
      std::string save_vertex(graph_type::vertex_type v) { return ""; }
      std::string save_edge(graph_type::edge_type e) { return ""; }
 };

The save_vertex() and save_edge() functions are called on each vertex and each edge in the graph, respectively. Each returns a string, which is then written directly to the output file.

For instance, to save an output file made up of tab-separated [webpage] [pagerank] lines, we may implement the following:

 #include <sstream>
 #include <string>

 class graph_writer {
   public:
      std::string save_vertex(graph_type::vertex_type v) {
        std::stringstream strm;
        // remember the \n at the end! This will provide a line break
        // after each page.
        strm << v.data().pagename << "\t" << v.data().pagerank << "\n";
        return strm.str();
      }
      std::string save_edge(graph_type::edge_type e) { return ""; }
 };

Since we are not interested in the edges, the save_edge() function simply returns an empty string.
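If the edges were of interest instead, save_edge() could emit one line per edge in the same way. The following is a minimal sketch, not the library's own writer: mock_vertex, mock_edge and edge_list_writer are hypothetical stand-ins so that the example compiles on its own, and it assumes that the real edge type exposes source() and target() and that the real vertex type exposes id(), which should be checked against the distributed_graph reference.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins so the sketch compiles on its own. In a GraphLab
// program these would be graph_type::vertex_type and graph_type::edge_type.
struct mock_vertex {
  unsigned long vid;
  unsigned long id() const { return vid; }
};

struct mock_edge {
  mock_vertex src;
  mock_vertex dst;
  mock_vertex source() const { return src; }
  mock_vertex target() const { return dst; }
};

class edge_list_writer {
  public:
     // Vertices are not of interest here, so nothing is written for them.
     std::string save_vertex(mock_vertex) { return ""; }

     // Writes one "[source id]\t[target id]\n" line per edge.
     std::string save_edge(mock_edge e) {
       std::stringstream strm;
       strm << e.source().id() << "\t" << e.target().id() << "\n";
       return strm.str();
     }
};
```

With a writer like this, the vertex flag passed to graph.save() would be set to false and the edge flag to true.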

Note: std::stringstream is relatively slow and is not the fastest way to build a string. Performance gains can be made by using C string operations instead.
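As an illustration of the C string approach, the following sketch builds the same line with snprintf. The page_data struct and the format_vertex_line() helper are hypothetical names introduced only for this example; page_data stands in for whatever v.data() returns in the real program. No speedup is measured here, so any gain should be confirmed by profiling.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the vertex data used in this example.
struct page_data {
  std::string pagename;
  double pagerank;
};

// Builds one "[webpage]\t[pagerank]\n" line without a stringstream.
std::string format_vertex_line(const page_data& d) {
  char rank[64];
  // %g is the conversion used by operator<< for doubles under the
  // default stream settings (6 significant digits).
  int n = std::snprintf(rank, sizeof(rank), "%g", d.pagerank);
  if (n < 0) return std::string();  // formatting failed
  if (n >= static_cast<int>(sizeof(rank))) {
    n = static_cast<int>(sizeof(rank)) - 1;  // output was truncated
  }
  std::string line;
  line.reserve(d.pagename.size() + n + 2);
  line += d.pagename;
  line += '\t';
  line.append(rank, n);
  line += '\n';
  return line;
}
```

Assuming the vertex data has this shape, save_vertex() would then reduce to returning format_vertex_line(v.data()).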

Then, to write the graph, we call:

 graph.save("output",
            graph_writer(),
            false, // set to true if each output file is to be gzipped
            true, // whether vertices are saved
            false); // whether edges are saved

This will save a sequence of files named output_1_of_N, output_2_of_N, ... where N is some integer. Concatenating all the files together will produce the combined output. If the gzip option is set, each of the files will have a .gz suffix, and gunzip must be used to decompress each file before reading.
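The concatenation step can be sketched as follows for uncompressed files on a local file system. The concatenate_parts() helper is a hypothetical name, not part of GraphLab, and it assumes the part files follow exactly the prefix_i_of_N naming pattern described above.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Appends prefix_1_of_N ... prefix_N_of_N, in order, to the file `combined`.
// Returns false if a part cannot be opened or the output cannot be written.
bool concatenate_parts(const std::string& prefix, int num_parts,
                       const std::string& combined) {
  std::ofstream out(combined.c_str(), std::ios::binary);
  if (!out) return false;
  for (int i = 1; i <= num_parts; ++i) {
    std::ostringstream name;
    name << prefix << "_" << i << "_of_" << num_parts;
    std::ifstream in(name.str().c_str(), std::ios::binary);
    if (!in) return false;
    // Streaming an empty buffer would set failbit on `out`,
    // so empty parts are skipped.
    if (in.peek() != std::ifstream::traits_type::eof()) {
      out << in.rdbuf();
    }
  }
  return out.good();
}
```

For example, concatenate_parts("output", 4, "output_all") would merge output_1_of_4 through output_4_of_4 into output_all. Gzipped parts would need to be decompressed first.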

If the output path is located on HDFS, for instance:

hdfs:///namenode/data/output

the result will be saved to the HDFS cluster with the given namenode, in the subdirectory /data, with the filenames output_1_of_N, output_2_of_N, ...

There are several other "built-in" saving formats, which can be accessed through the graph.save_format() function (graphlab::distributed_graph::save_format()).

The next section is a brief conclusion.