Vince's CSV Parser
csv Namespace Reference

The all-encompassing namespace.

Namespaces

 internals
 Stuff that is generally not of interest to end-users.
 

Classes

struct  CSVGuessResult
 Stores the inferred format of a CSV file.
 
class  CSVFormat
 Stores information about how to parse a CSV file.
 
class  CSVReader
 Main class for parsing CSVs from files and in-memory sources.
 
class  CSVField
 Data type representing individual CSV values.
 
class  CSVRow
 Data structure for representing CSV rows.
 
class  CSVStat
 Class for calculating statistics from CSV files and in-memory sources.
 
struct  CSVFileInfo
 Returned by get_file_info()
 
class  DelimWriter
 Class for writing delimiter-separated values files.
 

Typedefs

using RowCollection = internals::ThreadSafeDeque< CSVRow >
 Standard type for storing collection of rows.
 
using string_view = nonstd::string_view
 The string_view class used by this library.
 
template<bool B, class T = void>
using enable_if_t = typename std::enable_if< B, T >::type
 

Enumerations

enum class  VariableColumnPolicy { THROW = -1 , IGNORE_ROW = 0 , KEEP = 1 }
 Determines how to handle rows that are shorter or longer than the majority.
 
enum class  DataType {
  UNKNOWN = -1 , CSV_NULL , CSV_STRING , CSV_INT8 ,
  CSV_INT16 , CSV_INT32 , CSV_INT64 , CSV_DOUBLE
}
 Enumerates the different CSV field types that are recognized by this library.
 

Functions

std::vector< std::string > get_col_names (csv::string_view filename, CSVFormat format)
 Return a CSV's column names.
 
CSVGuessResult guess_format (csv::string_view filename, const std::vector< char > &delims)
 Guess the delimiter used by a delimiter-separated values file.
 
 CSVRow::operator std::vector< std::string > () const
 
 HEDLEY_NON_NULL (2) CSVRow
 
template<>
std::string CSVField::get< std::string > ()
 Retrieve this field's original string.
 
template<>
CONSTEXPR_14 csv::string_view CSVField::get< csv::string_view > ()
 Retrieve a view over this field's string.
 
Utility Functions
std::unordered_map< std::string, DataType > csv_data_types (const std::string &filename)
 Useful for uploading CSV files to SQL databases.
 
int get_col_pos (csv::string_view filename, csv::string_view col_name, const CSVFormat &format)
 Find the position of a column in a CSV file, or return CSV_NOT_FOUND if the column does not exist.
 
CSVFileInfo get_file_info (const std::string &filename)
 Get basic information about a CSV file.
 
Shorthand Parsing Functions

Convenience functions for parsing small strings

CSVReader parse (csv::string_view in, CSVFormat format)
 Shorthand function for parsing an in-memory CSV string.
 
CSVReader parse_no_header (csv::string_view in)
 Parses a CSV string with no headers.
 
CSVReader operator""_csv (const char *in, size_t n)
 Parse an RFC 4180 CSV string, returning a collection of CSVRow objects.
 
CSVReader operator""_csv_no_header (const char *in, size_t n)
 A shorthand for csv::parse_no_header()
 

Variables

constexpr int CSV_NOT_FOUND = -1
 Integer indicating a requested column wasn't found.
 

CSV Writing

template<class OutputStream , bool Flush = true>
using CSVWriter = DelimWriter< OutputStream, ',', '"', Flush>
 An alias for csv::DelimWriter for writing standard CSV files.
 
template<class OutputStream , bool Flush = true>
using TSVWriter = DelimWriter< OutputStream, '\t', '"', Flush>
 Class for writing tab-separated values files.
 
template<class OutputStream >
CSVWriter< OutputStream > make_csv_writer (OutputStream &out, bool quote_minimal=true)
 Return a csv::CSVWriter over the output stream.
 
template<class OutputStream >
CSVWriter< OutputStream, false > make_csv_writer_buffered (OutputStream &out, bool quote_minimal=true)
 Return a buffered csv::CSVWriter over the output stream (does not auto flush)
 
template<class OutputStream >
TSVWriter< OutputStream > make_tsv_writer (OutputStream &out, bool quote_minimal=true)
 Return a csv::TSVWriter over the output stream.
 
template<class OutputStream >
TSVWriter< OutputStream, false > make_tsv_writer_buffered (OutputStream &out, bool quote_minimal=true)
 Return a buffered csv::TSVWriter over the output stream (does not auto flush)
 

Detailed Description

The all-encompassing namespace.

Typedef Documentation

◆ CSVWriter

template<class OutputStream , bool Flush = true>
using csv::CSVWriter = DelimWriter<OutputStream, ',', '"', Flush>

An alias for csv::DelimWriter for writing standard CSV files.

See also
csv::DelimWriter::operator<<()
Note
Use csv::make_csv_writer() to instantiate this class over an actual output stream.

Definition at line 317 of file csv_writer.hpp.
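
To make the role of the delimiter and quote-character template arguments more concrete, the following is a minimal, self-contained sketch of how a row might be serialized with minimal quoting. It does not use csv::DelimWriter. The helpers quote_field and join_row are hypothetical names written for illustration, and the quoting rule (quote only fields that contain the delimiter, the quote character, or a line break, and double any embedded quote) is the RFC 4180 convention, assumed here rather than taken from the library's source.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): quote a field only when needed.
std::string quote_field(const std::string& field, char delim = ',', char quote = '"') {
    const bool needs_quotes =
        field.find(delim) != std::string::npos ||
        field.find(quote) != std::string::npos ||
        field.find('\n') != std::string::npos ||
        field.find('\r') != std::string::npos;
    if (!needs_quotes) return field;

    std::string out(1, quote);          // opening quote
    for (char c : field) {
        if (c == quote) out += quote;   // double each embedded quote character
        out += c;
    }
    out += quote;                       // closing quote
    return out;
}

// Hypothetical helper: join already-quoted fields with the delimiter.
std::string join_row(const std::vector<std::string>& row, char delim = ',', char quote = '"') {
    std::string line;
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (i > 0) line += delim;
        line += quote_field(row[i], delim, quote);
    }
    return line;
}
```

Passing '\t' as the delimiter gives the tab-separated equivalent, which is the only difference between the two aliases documented in this section.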

◆ TSVWriter

template<class OutputStream , bool Flush = true>
using csv::TSVWriter = DelimWriter<OutputStream, '\t', '"', Flush>

Class for writing tab-separated values files.

See also
csv::DelimWriter::write_row()
csv::DelimWriter::operator<<()
Note
Use csv::make_tsv_writer() to instantiate this class over an actual output stream.

Definition at line 328 of file csv_writer.hpp.

Enumeration Type Documentation

◆ DataType

enum class csv::DataType

Enumerates the different CSV field types that are recognized by this library.

Note
Overflowing integers will be stored and classified as doubles.
Unlike previous releases, integer enums here are platform agnostic.
Enumerator
CSV_NULL 

Empty string.

CSV_STRING 

Non-numeric string.

CSV_INT8 

8-bit integer

CSV_INT16 

16-bit integer (short on MSVC/GCC)

CSV_INT32 

32-bit integer (int on MSVC/GCC)

CSV_INT64 

64-bit integer (long long on MSVC/GCC)

CSV_DOUBLE 

Floating point value.

Definition at line 20 of file data_type.h.
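
The note about overflowing integers can be illustrated with a short sketch. It is not the library's classifier: SketchType, fits, and classify_integer are hypothetical names, and the input is assumed to already be a valid base-10 integer string. The idea shown is to pick the narrowest fixed-width type that holds the value and to fall back to a floating point classification when the value does not fit in 64 bits.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical enum for this sketch; it is not csv::DataType.
enum class SketchType { INT8, INT16, INT32, INT64, DOUBLE };

// True when value lies within the range of the fixed-width type T.
template<typename T>
bool fits(long long value) {
    return value >= std::numeric_limits<T>::min()
        && value <= std::numeric_limits<T>::max();
}

// Classify a string that is assumed to hold a valid base-10 integer.
SketchType classify_integer(const std::string& digits) {
    errno = 0;
    const long long value = std::strtoll(digits.c_str(), nullptr, 10);
    if (errno == ERANGE) return SketchType::DOUBLE;  // does not fit in long long
    if (fits<std::int8_t>(value))  return SketchType::INT8;
    if (fits<std::int16_t>(value)) return SketchType::INT16;
    if (fits<std::int32_t>(value)) return SketchType::INT32;
    if (fits<std::int64_t>(value)) return SketchType::INT64;
    return SketchType::DOUBLE;
}
```

Using the fixed-width types from <cstdint> rather than short, int, and long long keeps the ranges the same on every platform.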

Function Documentation

◆ csv_data_types()

std::unordered_map< std::string, DataType > csv::csv_data_types ( const std::string &  filename)

Useful for uploading CSV files to SQL databases.

Return a data type for each column such that every value in a column can be converted to the corresponding data type without data loss.

Parameters
[in]  filename  The CSV file
Returns
A mapping of column names to csv::DataType enums

Definition at line 240 of file csv_stat.cpp.
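
One way to think about "a data type for each column" is as a promotion over the types of the individual values. The sketch below is an assumption-laden illustration, not the logic in csv_stat.cpp: ColType and widest_type are hypothetical names, and the promotion order (any string forces a string column, otherwise the widest numeric type seen wins, and empty values are ignored) is assumed for the example.

```cpp
#include <vector>

// Hypothetical ordering for this sketch: later enumerators can represent
// every value of earlier ones under the assumed promotion rule.
enum class ColType { NONE, INT8, INT16, INT32, INT64, DOUBLE, STRING };

// Return the widest type observed among a column's values.
ColType widest_type(const std::vector<ColType>& value_types) {
    ColType result = ColType::NONE;  // NONE: no non-empty values seen yet
    for (ColType t : value_types) {
        if (static_cast<int>(t) > static_cast<int>(result)) result = t;
    }
    return result;
}
```

A column whose values were classified as INT8, INT32, and INT8 would map to INT32 under this rule, which is the kind of answer needed when choosing a SQL column type.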

◆ CSVField::get< csv::string_view >()

template<>
CONSTEXPR_14 csv::string_view csv::CSVField::get< csv::string_view > ( )

Retrieve a view over this field's string.

Warning
This string_view is only guaranteed to be valid as long as this CSVRow is still alive.

Definition at line 425 of file csv_row.hpp.
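
The lifetime warning above applies to any non-owning view. The sketch below shows the general pattern with std::string_view (C++17 is assumed; the library itself uses csv::string_view). OwningRow and first_field_copy are hypothetical stand-ins, not library types: the point is only that a view must be copied into a std::string if it needs to outlive the object that owns the characters.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for a row that owns its character data.
struct OwningRow {
    std::string data;

    // Non-owning view of everything before the first comma.
    std::string_view first_field() const {
        return std::string_view(data).substr(0, data.find(','));
    }
};

// Copy the field out before the row is destroyed.
std::string first_field_copy(const std::string& line) {
    OwningRow row{line};
    std::string_view view = row.first_field();  // valid only while row is alive
    return std::string(view);                   // owning copy, safe to return
}
```

Returning the view itself from first_field_copy would leave it pointing at memory freed when row goes out of scope.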

◆ get_col_names()

std::vector< std::string > csv::get_col_names ( csv::string_view  filename,
CSVFormat  format 
)

Return a CSV's column names.

Parameters
[in]  filename  Path to CSV file
[in]  format    Format of the CSV file

Guess delimiter and header row

Definition at line 125 of file csv_reader.cpp.

◆ get_col_pos()

int csv::get_col_pos ( csv::string_view  filename,
csv::string_view  col_name,
const CSVFormat &  format 
)

Find the position of a column in a CSV file, or return CSV_NOT_FOUND if the column does not exist.

Parameters
[in]  filename  Path to CSV file
[in]  col_name  Column whose position we should resolve
[in]  format    Format of the CSV file

Definition at line 53 of file csv_utility.cpp.
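
The sentinel-return convention described above can be sketched without the library. In the example below, find_column and NOT_FOUND are hypothetical names; NOT_FOUND simply mirrors the documented value of CSV_NOT_FOUND (-1), and the header is assumed to have already been read into a vector of column names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Mirrors the documented value of csv::CSV_NOT_FOUND; hypothetical name.
constexpr int NOT_FOUND = -1;

// Return the zero-based index of name in col_names, or NOT_FOUND.
int find_column(const std::vector<std::string>& col_names, const std::string& name) {
    for (std::size_t i = 0; i < col_names.size(); ++i) {
        if (col_names[i] == name) return static_cast<int>(i);
    }
    return NOT_FOUND;
}
```

Callers should compare the result against the sentinel before using it as an index.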

◆ get_file_info()

CSVFileInfo csv::get_file_info ( const std::string &  filename)

Get basic information about a CSV file.

#include "csv.hpp"
#include <iostream>
#include <string>

int main(int argc, char** argv) {
    using namespace csv;

    if (argc < 2) {
        std::cout << "Usage: " << argv[0] << " [file]" << std::endl;
        return 1;
    }

    std::string file = argv[1];
    auto info = get_file_info(file);

    // Print the file name, column names, dimensions, and delimiter
    std::cout << file << std::endl
        << "Columns: " << internals::format_row(info.col_names, ", ")
        << "Dimensions: " << info.n_rows << " rows x " << info.n_cols << " columns" << std::endl
        << "Delimiter: " << info.delim << std::endl;

    return 0;
}

Definition at line 64 of file csv_utility.cpp.

◆ operator""_csv()

CSVReader csv::operator""_csv ( const char *  in,
size_t  n 
)

Parse an RFC 4180 CSV string, returning a collection of CSVRow objects.

Example
TEST_CASE( "Test Escaped Comma", "[read_csv_comma]" ) {
    auto rows = "A,B,C\r\n" // Header row
                "123,\"234,345\",456\r\n"
                "1,2,3\r\n"
                "1,2,3"_csv;

    CSVRow row;
    rows.read_row(row);
    REQUIRE( vector<string>(row) ==
        vector<string>({"123", "234,345", "456"}));
}

Definition at line 37 of file csv_utility.cpp.
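
The (const char *, size_t) signature is the standard form for a user-defined string literal operator. As a rough sketch of the mechanism only, the hypothetical _record_count suffix below counts line-terminated records in a literal; it is not related to how the library parses, and it deliberately ignores line breaks inside quoted fields.

```cpp
#include <cstddef>

// Hypothetical literal suffix for illustration: count newline-terminated records.
// The compiler passes a pointer to the literal's characters and its length.
std::size_t operator""_record_count(const char* in, std::size_t n) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (in[i] == '\n') ++count;
    }
    if (n > 0 && in[n - 1] != '\n') ++count;  // final record without a trailing newline
    return count;
}
```

Adjacent string literals are concatenated before the suffix is applied, which is why a multi-line literal can carry the suffix on its last piece only, as in the example above.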

◆ parse()

CSVReader csv::parse ( csv::string_view  in,
CSVFormat  format 
)

Shorthand function for parsing an in-memory CSV string.

Returns
A collection of CSVRow objects
Example
TEST_CASE( "Test Escaped Quote", "[read_csv_quote]" ) {
    // Per RFC 4180, escaped quotes should be doubled up
    auto csv_string = GENERATE(as<std::string> {},
        (
            "A,B,C\r\n" // Header row
            "123,\"234\"\"345\",456\r\n"
            "123,\"234\"345\",456\r\n" // Unescaped single quote (not strictly valid)
            "123,\"234\"345\",\"456\"" // Quoted field at the end
        ),
        (
            "\"A\",\"B\",\"C\"\r\n" // Header row
            "123,\"234\"\"345\",456\r\n"
            "123,\"234\"345\",456\r\n" // Unescaped single quote (not strictly valid)
            "123,\"234\"345\",\"456\"" // Quoted field at the end
        )
    );

    SECTION("Escaped Quote") {
        auto rows = parse(csv_string);
        REQUIRE(rows.get_col_names() == vector<string>({ "A", "B", "C" }));

        // Expected Results: Double " is an escape for a single "
        vector<string> correct_row = { "123", "234\"345", "456" };
        for (auto& row : rows) {
            REQUIRE(vector<string>(row) == correct_row);
        }
    }
}

Definition at line 14 of file csv_utility.cpp.
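
For readers unfamiliar with the quoting rules exercised by the test above, here is a minimal sketch of splitting a single record under the RFC 4180 convention, where a doubled quote inside a quoted field stands for one literal quote. split_record is a hypothetical helper, not the library's parser: it handles one line only, performs no validation, and does not reproduce the library's lenient treatment of unescaped quotes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split one record into fields, honoring quoted fields.
std::vector<std::string> split_record(const std::string& line, char delim = ',', char quote = '"') {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == quote) {
                if (i + 1 < line.size() && line[i + 1] == quote) {
                    current += quote;   // doubled quote -> one literal quote
                    ++i;
                } else {
                    in_quotes = false;  // closing quote
                }
            } else {
                current += c;           // delimiters are literal inside quotes
            }
        } else if (c == quote) {
            in_quotes = true;           // opening quote
        } else if (c == delim) {
            fields.push_back(current);  // end of field
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);          // last field (possibly empty)
    return fields;
}
```

A production parser also has to handle line breaks inside quoted fields and input that arrives in chunks, which this sketch leaves out.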

◆ parse_no_header()

CSVReader csv::parse_no_header ( csv::string_view  in)

Parses a CSV string with no headers.

Returns
A collection of CSVRow objects

Definition at line 23 of file csv_utility.cpp.