I’m trying to read Hugging Face Arrow files with libarrow from both C++ and Python, and I get:
Invalid: Not an Arrow file.
Python Code:
import pyarrow as pa
with open('glue-test.arrow', 'rb') as f:
    data = pa.ipc.open_file(f)
C++ Code:
std::shared_ptr<arrow::io::ReadableFile> infile;
ARROW_ASSIGN_OR_RAISE(infile, arrow::io::ReadableFile::Open("data-00000-of-00001.arrow", arrow::default_memory_pool()));
ARROW_ASSIGN_OR_RAISE(auto ipc_reader, arrow::ipc::RecordBatchFileReader::Open(infile));
Both result in:
Invalid: Not an Arrow file
I create the Arrow files using:
from datasets import load_dataset
snli = load_dataset('snli', split='train')
snli.save_to_disk("tempdata")
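One check that might narrow this down: the Arrow IPC file format begins and ends with the magic bytes ARROW1, so a file without them would be rejected as "Not an Arrow file". Below is a minimal sketch of that check using only the standard library. The helper name looks_like_arrow_file is hypothetical (it is not part of pyarrow or datasets), and it only inspects the magic bytes, not the rest of the file.

```python
import os

# Magic bytes that open and close an Arrow IPC *file* (random-access) format file.
ARROW_MAGIC = b"ARROW1"


def looks_like_arrow_file(path):
    """Hypothetical helper: True if `path` starts and ends with the Arrow file magic."""
    # A file too small to hold both magic strings cannot be in the file format.
    if os.path.getsize(path) < 2 * len(ARROW_MAGIC):
        return False
    with open(path, "rb") as f:
        head = f.read(len(ARROW_MAGIC))
        # Jump to the last six bytes of the file and read the trailing magic.
        f.seek(-len(ARROW_MAGIC), os.SEEK_END)
        tail = f.read(len(ARROW_MAGIC))
    return head == ARROW_MAGIC and tail == ARROW_MAGIC
```

If this returns False for the files written by save_to_disk, one possibility is that they are in the Arrow IPC streaming format instead, in which case pa.ipc.open_stream(f) rather than pa.ipc.open_file(f) would be the reader to try. That is an assumption about the cause, not something confirmed here.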
Any ideas would be appreciated. I’m really stuck on this.