This class implements a Dataset to be used for a learning-to-rank (L-t-R) task.
#include <dataset.h>
virtual std::ostream & put(std::ostream &os) const
Prints the data reading time stats.
The internal representation is quite simple: a single vector of num_instances() x num_features() values stored in horizontal format (instances x features), where each training instance is a document. The function at() gives direct access to this internal representation to support fast access and custom high-performance implementations.
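As a rough illustration of the horizontal layout described above, the sketch below keeps all feature values in one contiguous vector and computes the position of a value from its document and feature indices. The names FlatDataset and row_sum, and the choice of float for Feature, are assumptions made for this example only; the actual QuickRank implementation may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the Feature type; the real typedef is not shown here.
using Feature = float;

// Minimal sketch of a horizontal (instances x features) layout.
// Illustration only, not the QuickRank implementation.
class FlatDataset {
 public:
  FlatDataset(std::size_t n_instances, std::size_t n_features)
      : num_instances_(n_instances),
        num_features_(n_features),
        data_(n_instances * n_features, Feature{0}) {}

  // Pointer to the value of feature_id for document_id.
  Feature* at(std::size_t document_id, std::size_t feature_id) {
    assert(document_id < num_instances_ && feature_id < num_features_);
    return data_.data() + document_id * num_features_ + feature_id;
  }

  std::size_t num_instances() const { return num_instances_; }
  std::size_t num_features() const { return num_features_; }

 private:
  std::size_t num_instances_;
  std::size_t num_features_;
  std::vector<Feature> data_;  // one contiguous row per document
};

// Sums all features of one document by walking its contiguous row.
inline Feature row_sum(FlatDataset& ds, std::size_t document_id) {
  Feature* row = ds.at(document_id, 0);
  Feature sum = 0;
  for (std::size_t f = 0; f < ds.num_features(); ++f) sum += row[f];
  return sum;
}
```

Because each document occupies one contiguous row, a pointer obtained for feature 0 can be advanced through the remaining features of that document without further index arithmetic.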
quickrank::data::Dataset::Dataset(size_t n_instances, size_t n_features)
Allocates an empty Dataset of given size in horizontal format.
- Parameters
-
n_instances | The number of training instances (lines) in the dataset. |
n_features | The number of features. |
quickrank::data::Dataset::~Dataset() [virtual]
quickrank::data::Dataset::Dataset(const Dataset &other) [delete]
Avoid inefficient copy constructor.
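The effect of a deleted copy constructor can be sketched with a small stand-alone class. BigTable, count_values, make_table and demo are hypothetical names used only for this illustration; the sketch shows the general C++ idiom of passing by reference or handing over ownership through a smart pointer, not how QuickRank itself shares datasets.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical class used only to illustrate deleted copy operations.
class BigTable {
 public:
  explicit BigTable(std::size_t n) : values_(n, 0.0f) {}
  BigTable(const BigTable&) = delete;             // no copy construction
  BigTable& operator=(const BigTable&) = delete;  // no copy assignment
  std::size_t size() const { return values_.size(); }

 private:
  std::vector<float> values_;
};

// Large objects are passed by reference instead of being copied.
inline std::size_t count_values(const BigTable& table) { return table.size(); }

// Ownership is handed over through a smart pointer when needed.
inline std::unique_ptr<BigTable> make_table(std::size_t n) {
  return std::unique_ptr<BigTable>(new BigTable(n));
}

// Small usage example: build a table and inspect it without copying.
inline std::size_t demo() {
  std::unique_ptr<BigTable> table = make_table(4);
  assert(table != nullptr);
  return count_values(*table);
}
```

With these declarations, an accidental pass-by-value of the class fails at compile time rather than silently duplicating the whole data array.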
void quickrank::data::Dataset::addInstance(QueryID q_id, Label i_label, std::vector< Feature > i_features)
Add a new training instance, i.e., a labeled document, to the dataset.
- Warning
- Currently the addition works only when data is in HORIZ format.
- Parameters
-
q_id | The query ID. |
i_label | The relevance label of the result. |
i_features | The feature vector of the document. |
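One way such an append operation could work on a horizontal layout is sketched below. This is an assumption-laden illustration: GrowingDataset is a hypothetical class, the type aliases are stand-ins, the features are taken by const reference rather than by value, and the sketch assumes that all documents of the same query are added consecutively so that a change of query ID marks the start of a new query results list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the library typedefs.
using QueryID = unsigned int;
using Label = float;
using Feature = float;

// Illustration only, not the QuickRank implementation.
class GrowingDataset {
 public:
  GrowingDataset(std::size_t max_instances, std::size_t n_features)
      : max_instances_(max_instances),
        num_features_(n_features),
        data_(max_instances * n_features, Feature{0}),
        labels_(max_instances, Label{0}) {}

  void addInstance(QueryID q_id, Label label,
                   const std::vector<Feature>& features) {
    assert(num_instances_ < max_instances_);
    assert(features.size() == num_features_);
    // Assumption: a change of query ID starts a new query results list.
    if (num_instances_ == 0 || q_id != last_query_id_) {
      offsets_.push_back(num_instances_);
      last_query_id_ = q_id;
    }
    // Copy the feature vector into the next free row.
    std::copy(features.begin(), features.end(),
              data_.begin() + num_instances_ * num_features_);
    labels_[num_instances_] = label;
    ++num_instances_;
  }

  std::size_t num_instances() const { return num_instances_; }
  std::size_t num_queries() const { return offsets_.size(); }
  std::size_t offset(std::size_t i) const { return offsets_[i]; }
  Label getLabel(std::size_t document_id) const { return labels_[document_id]; }
  Feature* at(std::size_t document_id, std::size_t feature_id) {
    return data_.data() + document_id * num_features_ + feature_id;
  }

 private:
  std::size_t max_instances_;
  std::size_t num_features_;
  std::size_t num_instances_ = 0;
  QueryID last_query_id_ = 0;
  std::vector<Feature> data_;
  std::vector<Label> labels_;
  std::vector<std::size_t> offsets_;  // first document of each query
};
```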
quickrank::Feature* quickrank::data::Dataset::at(size_t document_id, size_t feature_id) [inline]
Returns a pointer to a specific data item.
- Parameters
-
document_id | The document of interest. |
feature_id | The feature of interest. |
- Returns
- A pointer to the requested feature value of the given document.
Label quickrank::data::Dataset::getLabel(size_t document_id) [inline]
Returns the relevance label of the document with the given document_id.
std::unique_ptr< QueryResults > quickrank::data::Dataset::getQueryResults(size_t i) const
Returns the i-th QueryResults in the dataset.
- Parameters
-
i | The index of the query results list of interest. |
- Returns
- The requested QueryResults.
size_t quickrank::data::Dataset::num_features() const [inline]
Returns the number of features used to represent a document.
size_t quickrank::data::Dataset::num_instances() const [inline]
Returns the number of documents in the dataset.
size_t quickrank::data::Dataset::num_queries() const [inline]
Returns the number of queries in the dataset.
size_t quickrank::data::Dataset::offset(size_t i) const [inline]
Returns the offset in the internal data structure of the i-th query results list.
- Parameters
-
i | The index of the query results list of interest. |
- Returns
- The offset of the first document in the i-th query results list. This can be used to later invoke the at() function.
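The way query offsets combine with a horizontal data layout can be sketched as follows. The function per_query_mean is hypothetical, and the sketch assumes that the offsets vector carries one trailing entry equal to the total number of documents, so that the documents of query q are those in [offsets[q], offsets[q + 1]); whether the library stores such a trailing entry is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Feature = float;  // hypothetical stand-in for the library typedef

// Returns, for each query, the mean of one feature over its documents.
// data is stored horizontally (documents x num_features).
// Assumption: offsets has one trailing entry equal to the number of documents.
inline std::vector<Feature> per_query_mean(
    const std::vector<Feature>& data, std::size_t num_features,
    const std::vector<std::size_t>& offsets, std::size_t feature_id) {
  assert(!offsets.empty() && feature_id < num_features);
  std::vector<Feature> means;
  for (std::size_t q = 0; q + 1 < offsets.size(); ++q) {
    const std::size_t first = offsets[q];     // first document of query q
    const std::size_t last = offsets[q + 1];  // one past its last document
    Feature sum = 0;
    for (std::size_t doc = first; doc < last; ++doc) {
      sum += data[doc * num_features + feature_id];
    }
    means.push_back(last > first ? sum / static_cast<Feature>(last - first)
                                 : Feature{0});
  }
  return means;
}
```

For example, with three documents of two features each and offsets {0, 2, 3}, the first two documents belong to query 0 and the third to query 1.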
Avoid inefficient copy assignment.
std::ostream & quickrank::data::Dataset::put(std::ostream &os) const [private, virtual]
Prints the data reading time stats.
std::ostream& operator<<(std::ostream &os, const Dataset &me) [friend]
The output stream operator.
Prints the data reading time stats.
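A friend operator<< paired with a private virtual put() is a common C++ idiom for making stream output polymorphic. The sketch below shows that idiom under the assumption that the operator simply forwards to put(); Printable, TimedLoader and render are hypothetical names, and the printed strings are placeholders rather than the library's actual output.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical base class illustrating the idiom.
class Printable {
 public:
  virtual ~Printable() = default;

  // Non-member friend: forwards to the virtual hook so that derived
  // classes can customize what gets printed.
  friend std::ostream& operator<<(std::ostream& os, const Printable& me) {
    return me.put(os);
  }

 private:
  virtual std::ostream& put(std::ostream& os) const { return os << "base"; }
};

// Hypothetical derived class overriding the private hook.
class TimedLoader : public Printable {
 private:
  std::ostream& put(std::ostream& os) const override {
    return os << "derived";
  }
};

// Renders any Printable to a string through the stream operator.
inline std::string render(const Printable& p) {
  std::ostringstream oss;
  oss << p;
  assert(oss.good());
  return oss.str();
}
```

Keeping put() private means callers can only reach it through operator<<, while derived classes remain free to override it.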
size_t quickrank::data::Dataset::last_instance_id_ [private]
size_t quickrank::data::Dataset::max_instances_ [private]
size_t quickrank::data::Dataset::num_features_ [private]
size_t quickrank::data::Dataset::num_instances_ [private]
size_t quickrank::data::Dataset::num_queries_ [private]
std::vector<size_t> quickrank::data::Dataset::offsets_ [private]
The documentation for this class was generated from the following files: