Vince's CSV Parser
csv::CSVReader Class Reference

Main class for parsing CSVs from files and in-memory sources.

#include <csv_reader.hpp>

Classes

class  iterator
 An input iterator capable of handling large files.
 

Public Member Functions

 CSVReader (const CSVReader &)=delete
 
 CSVReader (CSVReader &&)=default
 
CSVReader & operator= (const CSVReader &)=delete
 
CSVReader & operator= (CSVReader &&other)=default
 
Constructors

Constructors for iterating over large files and parsing in-memory sources.

 CSVReader (csv::string_view filename, CSVFormat format=CSVFormat::guess_csv())
 Reads an arbitrarily large CSV file using memory-mapped IO.
 
template<typename TStream , csv::enable_if_t< std::is_base_of< std::istream, TStream >::value, int > = 0>
 CSVReader (TStream &source, CSVFormat format=CSVFormat())
 Allows parsing stream sources such as std::stringstream or std::ifstream
 
Retrieving CSV Rows
bool read_row (CSVRow &row)
 Retrieve rows as CSVRow objects, returning true if more rows are available.
 
iterator begin ()
 Return an iterator to the first row in the reader.
 
HEDLEY_CONST iterator end () const noexcept
 A placeholder for the imaginary past the end row in a CSV.
 
bool eof () const noexcept
 Returns true if we have reached end of file.
 
CSV Metadata
CSVFormat get_format () const
 Return the format of the original raw CSV.
 
std::vector< std::string > get_col_names () const
 Return the CSV's column names as a vector of strings.
 
int index_of (csv::string_view col_name) const
 Return the index of the column name if found or csv::CSV_NOT_FOUND otherwise.
 
CSV Metadata: Attributes
CONSTEXPR bool empty () const noexcept
 Whether or not the file or stream contains valid CSV rows, not including the header.
 
CONSTEXPR size_t n_rows () const noexcept
 Retrieves the number of rows that have been read so far.
 
bool utf8_bom () const noexcept
 Whether or not the CSV was prefixed with a UTF-8 BOM.
 

Protected Member Functions

void set_col_names (const std::vector< std::string > &)
 Sets this reader's column names and associated data.
 
Multi-Threaded File Reading Functions
bool read_csv (size_t bytes=internals::ITERATION_CHUNK_SIZE)
 Read a chunk of CSV data.
 

Protected Attributes

CSV Settings
CSVFormat _format
 
Parser State
internals::ColNamesPtr col_names = std::make_shared<internals::ColNames>()
 Pointer to an object containing column information.
 
std::unique_ptr< internals::IBasicCSVParser > parser = nullptr
 Helper class which actually does the parsing.
 
std::unique_ptr< RowCollection > records {new RowCollection(100)}
 Queue of parsed CSV rows.
 
size_t n_cols = 0
 The number of columns in this CSV.
 
size_t _n_rows = 0
 How many rows (minus header) have been read so far.
 

Detailed Description

Main class for parsing CSVs from files and in-memory sources.

All rows are compared to the column names for length consistency:

  • By default, rows that are too short or too long are dropped
  • Custom behavior can be defined by overriding bad_row_handler in a subclass

Definition at line 57 of file csv_reader.hpp.
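
The default length check described above can be pictured with a short sketch. This is an illustrative stand-in, not the library's implementation: `Row` and `keep_consistent_rows` are hypothetical names, and only the standard library is used.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical alias: one parsed row is a list of field strings.
using Row = std::vector<std::string>;

// Hypothetical helper (not part of csv::CSVReader): keep only the rows
// whose field count matches the number of column names.
std::vector<Row> keep_consistent_rows(const Row& col_names,
                                      const std::vector<Row>& rows) {
    std::vector<Row> kept;
    for (const Row& row : rows) {
        // Rows that are too short or too long are dropped.
        if (row.size() == col_names.size()) {
            kept.push_back(row);
        }
    }
    return kept;
}
```

With a three-column header, rows holding two or four fields would be discarded and only rows with exactly three fields would be kept.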

Constructor & Destructor Documentation

◆ CSVReader() [1/2]

csv::CSVReader::CSVReader ( csv::string_view  filename,
CSVFormat  format = CSVFormat::guess_csv() 
)

Reads an arbitrarily large CSV file using memory-mapped IO.

Details: Reads the first block of a CSV file synchronously to get information such as column names and delimiting character.

Parameters
[in]  filename  Path to CSV file
[in]  format    Format of the CSV file

With the default format (CSVFormat::guess_csv()), the delimiter and header row are guessed.

Definition at line 154 of file csv_reader.cpp.

◆ CSVReader() [2/2]

template<typename TStream , csv::enable_if_t< std::is_base_of< std::istream, TStream >::value, int > = 0>
csv::CSVReader::CSVReader ( TStream &  source,
CSVFormat  format = CSVFormat() 
)
inline

Allows parsing stream sources such as std::stringstream or std::ifstream

Template Parameters
TStream  An input stream deriving from std::istream
Note
Currently this constructor requires special CSV dialects to be manually specified.

Definition at line 120 of file csv_reader.hpp.
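
The template constraint in the signature above restricts TStream to types derived from std::istream. A minimal sketch of the same kind of constraint on a hypothetical free function follows. `count_lines` is not part of the library, and std::enable_if is used here on the assumption that csv::enable_if_t behaves like its standard counterpart.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical function for illustration. Only the enable_if constraint
// mirrors the documented constructor signature: the overload exists only
// when TStream derives from std::istream.
template <typename TStream,
          typename std::enable_if<
              std::is_base_of<std::istream, TStream>::value, int>::type = 0>
std::size_t count_lines(TStream& source) {
    std::size_t n = 0;
    std::string line;
    // Consume the stream line by line until it is exhausted.
    while (std::getline(source, line)) {
        ++n;
    }
    return n;
}
```

Passing a std::stringstream or std::ifstream compiles, while passing an unrelated type such as std::string fails overload resolution at compile time.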

Member Function Documentation

◆ empty()

CONSTEXPR bool csv::CSVReader::empty ( ) const
inline noexcept

Whether or not the file or stream contains valid CSV rows, not including the header.

Note
Gives an accurate answer regardless of when it is called.

Definition at line 167 of file csv_reader.hpp.

◆ end()

HEDLEY_CONST CSVReader::iterator csv::CSVReader::end ( ) const
noexcept

A placeholder for the imaginary past the end row in a CSV.

Attempting to dereference this will lead to bad things.

Definition at line 26 of file csv_reader_iterator.cpp.

◆ read_row()

bool csv::CSVReader::read_row ( CSVRow &  row )

Retrieve rows as CSVRow objects, returning true if more rows are available.

Performance Notes
Parameters
[out]  row  The variable where the parsed row will be stored
See also
CSVRow, CSVField

Example:
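
A minimal sketch of the out-parameter loop that read_row() is designed for. `ToyReader` and `count_fields` are hypothetical stand-ins built on the standard library so that the snippet compiles on its own. They are not csv::CSVReader, and the naive comma splitting ignores quoting and escaping.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration only; NOT csv::CSVReader.
class ToyReader {
public:
    explicit ToyReader(const std::string& text) : in_(text) {}

    // Stores the next line's comma-separated fields in `row`.
    // Returns true if a row was stored, false once input is exhausted.
    bool read_row(std::vector<std::string>& row) {
        std::string line;
        if (!std::getline(in_, line)) {
            return false;
        }
        row.clear();
        std::stringstream fields(line);
        std::string field;
        while (std::getline(fields, field, ',')) {
            row.push_back(field);
        }
        ++n_rows_;
        return true;
    }

    // Number of rows read so far.
    std::size_t n_rows() const noexcept { return n_rows_; }

private:
    std::istringstream in_;
    std::size_t n_rows_ = 0;
};

// Reuses a single row object across iterations, as the out-parameter allows.
std::size_t count_fields(const std::string& text) {
    ToyReader reader(text);
    std::vector<std::string> row;
    std::size_t total = 0;
    while (reader.read_row(row)) {
        total += row.size();
    }
    return total;
}
```

The same `while (reader.read_row(row))` shape applies to the real class, with a CSVRow in place of the vector.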

Definition at line 272 of file csv_reader.cpp.


The documentation for this class was generated from the following files: