Hello again,
Thanks again for the tip.
As far as I understand, the Hdfs3SinkConnector should first be used to write the data to HDFS in Hadoop, and the Hdfs3SourceConnector is then used to read the written data back out of Hadoop.
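For reference, this is roughly the sink config I have in mind (just a minimal sketch; the connector name and the hdfs.url and flush.size values are my assumptions):

{
  "name": "hdfs3-sink",
  "config": {
    "connector.class": "io.confluent.connect.hdfs3.Hdfs3SinkConnector",
    "tasks.max": "1",
    "topics": "parquet_field_hdfs",
    "hdfs.url": "hdfs://localhost:9000",
    "flush.size": "3",
    "format.class": "io.confluent.connect.hdfs3.parquet.ParquetFormat"
  }
}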
My problem now is with writing the data via the console producer: I have data types like mp4 and xml that I want to get into my Hadoop.
I am researching how to write the mp4 data to Hadoop through the console producer. Most of what I have seen so far are just plain messages or a self-written schema, as shown below:
./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic parquet_field_hdfs \
  --property value.schema='{"type": "record", "name": "myrecord", "fields": [{"name": "name", "type": "string"}, {"name": "address", "type": "string"}, {"name": "age", "type": "int"}, {"name": "is_customer", "type": "boolean"}]}'
and then paste each of these messages:
{"name": "Peter", "address": "Mountain View", "age": 27, "is_customer": true}
{"name": "David", "address": "Mountain View", "age": 37, "is_customer": false}
{"name": "Kat", "address": "Palo Alto", "age": 30, "is_customer": true}
{"name": "David", "address": "San Francisco", "age": 35, "is_customer": false}
{"name": "Leslie", "address": "San Jose", "age": 26, "is_customer": true}
{"name": "Dani", "address": "Seattle", "age": 32, "is_customer": false}
{"name": "Kim", "address": "San Jose", "age": 30, "is_customer": true}
{"name": "Steph", "address": "Seattle", "age": 31, "is_customer": false}
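If I understand the connector docs correctly, the sink should then roll these records into files under the topics directory in HDFS, so something like this should show them (the path is my assumption, based on the default topics.dir of /topics):

hdfs dfs -ls /topics/parquet_field_hdfs/partition=0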
or like this:
./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test_hdfs \
  --property value.schema='{"type": "record", "name": "myrecord", "fields": [{"name": "f1", "type": "string"}]}'
and then paste each of these messages:
{"f1": "value1"}
{"f1": "value2"}
{"f1": "value3"}
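For xml, the closest I have found so far is to treat each document as one plain string per line with the normal console producer (just a sketch; the topic name xml_hdfs is made up, and the sink would then need a string-compatible format instead of Parquet):

./bin/kafka-console-producer --broker-list localhost:9092 --topic xml_hdfs
<customer><name>Peter</name><address>Mountain View</address></customer>
<customer><name>Kat</name><address>Palo Alto</address></customer>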
I don't know if you have experience with other data like mp4 or xml, which are also on my local filesystem.
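One idea I had for the mp4 files was to base64-encode them so each file becomes a single text line, but I am not sure this is the right approach (just a sketch; video.mp4 and the topic name mp4_hdfs are made up, and -w 0 is the GNU coreutils option that disables line wrapping):

base64 -w 0 video.mp4 | ./bin/kafka-console-producer --broker-list localhost:9092 --topic mp4_hdfs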
Thanks in advance for everything.