[ https://jira.fiware.org/browse/HELP-15764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fernando Lopez reassigned HELP-15764:
-------------------------------------

    Assignee: Fermín Galán

> [fiware-stackoverflow] How to match column between HDFS File(save through cygnus) to Hive table?
> ------------------------------------------------------------------------------------------------
>
>                 Key: HELP-15764
>                 URL: https://jira.fiware.org/browse/HELP-15764
>             Project: Help-Desk
>          Issue Type: Monitor
>          Components: FIWARE-TECH-HELP
>            Reporter: Backlog Manager
>            Assignee: Fermín Galán
>              Labels: fiware, fiware-cygnus, hdfs, hive
>
> Created question in FIWARE Q/A platform on 02-05-2019 at 10:05
> {color: red}Please, ANSWER this question AT{color} https://stackoverflow.com/questions/55948021/how-to-match-column-between-hdfs-filesave-through-cygnus-to-hive-table
>
> +Question:+
> How to match column between HDFS File(save through cygnus) to Hive table?
>
> +Description:+
> I want to save some data in Hive through fiware-cygnus. I set the conf file as follows:
>
> cygnus-ngsi.sources = http-source
> cygnus-ngsi.sinks = hdfs-sink
> cygnus-ngsi.channels = hdfs-channel
> cygnus-ngsi.sources.http-source.channels = hdfs-channel
> cygnus-ngsi.sources.http-source.type = org.apache.flume.source.http.HTTPSource
> cygnus-ngsi.sources.http-source.port = 5050
> cygnus-ngsi.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
> cygnus-ngsi.sources.http-source.handler.notification_target = /notify
> cygnus-ngsi.sources.http-source.handler.default_service = def_serv
> cygnus-ngsi.sources.http-source.handler.default_service_path = /def_servpath
> cygnus-ngsi.sources.http-source.handler.events_ttl = 2
> cygnus-ngsi.sources.http-source.interceptors = ts gi
> cygnus-ngsi.sources.http-source.interceptors.ts.type = timestamp
> cygnus-ngsi.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
> cygnus-ngsi.sources.http-source.interceptors.gi.grouping_rules_conf_file = /Applications/apache-flume-1.4.0-bin/conf/grouping_rules.confi
> cygnus-ngsi.channels.hdfs-channel.type = memory
> cygnus-ngsi.channels.hdfs-channel.capacity = 1000
> cygnus-ngsi.channels.hdfs-channel.transactionCapacity = 100
> cygnus-ngsi.sinks.hdfs-sink.type = com.telefonica.iot.cygnus.sinks.NGSIHDFSSink
> cygnus-ngsi.sinks.hdfs-sink.channel = hdfs-channel
> cygnus-ngsi.sinks.hdfs-sink.enable_encoding = true
> cygnus-ngsi.sinks.hdfs-sink.enable_grouping = false
> cygnus-ngsi.sinks.hdfs-sink.enable_lowercase = false
> cygnus-ngsi.sinks.hdfs-sink.enable_name_mappings = false
> cygnus-ngsi.sinks.hdfs-sink.data_model = dm-by-entity
> cygnus-ngsi.sinks.hdfs-sink.file_format = json-row
> cygnus-ngsi.sinks.hdfs-sink.backend.impl = rest
> cygnus-ngsi.sinks.hdfs-sink.backend.max_conns = 500
> cygnus-ngsi.sinks.hdfs-sink.backend.max_conns_per_route = 100
> cygnus-ngsi.sinks.hdfs-sink.hdfs_host = [hdfs_host]
> cygnus-ngsi.sinks.hdfs-sink.hdfs_port = 50070
> cygnus-ngsi.sinks.hdfs-sink.hdfs_username = [username]
> cygnus-ngsi.sinks.hdfs-sink.hdfs_password = [password]
> cygnus-ngsi.sinks.hdfs-sink.oauth2_token = [token]
> cygnus-ngsi.sinks.hdfs-sink.service_as_namespace = false
> cygnus-ngsi.sinks.hdfs-sink.batch_size = 100
> cygnus-ngsi.sinks.hdfs-sink.batch_timeout = 30
> cygnus-ngsi.sinks.hdfs-sink.batch_ttl = 10
> cygnus-ngsi.sinks.hdfs-sink.batch_retry_intervals = 5000
> cygnus-ngsi.sinks.hdfs-sink.hive = true
> cygnus-ngsi.sinks.hdfs-sink.hive.server_version = 2
> cygnus-ngsi.sinks.hdfs-sink.hive.host = [hive_host]
> cygnus-ngsi.sinks.hdfs-sink.hive.port = 10000
> cygnus-ngsi.sinks.hdfs-sink.hive.db_type = default-db
> cygnus-ngsi.sinks.hdfs-sink.krb5_auth = false
>
> and I found the HDFS file was created:
>
> {"recvx0054imex0054s":"1556776186","recvx0054ime":"2019-05-02T05:49:46.771Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"lux","attrx0054ype":"Float","attrx0056alue":"7","attrx004dd":[]}
> {"recvx0054imex0054s":"1556776186","recvx0054ime":"2019-05-02T05:49:46.771Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"status","attrx0054ype":"string","attrx0056alue":"off","attrx004dd":[]}
> {"recvx0054imex0054s":"1556776989","recvx0054ime":"2019-05-02T06:03:09.962Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"lux","attrx0054ype":"Float","attrx0056alue":"7","attrx004dd":[]}
> {"recvx0054imex0054s":"1556776989","recvx0054ime":"2019-05-02T06:03:09.962Z","fiwarex0053ervicex0050ath":"/LightInfo","entityx0049d":"bedroom3_light","entityx0054ype":"Light","attrx004eame":"status","attrx0054ype":"string","attrx0056alue":"off","attrx004dd":[]}
>
> but in Hive, all the data shows NULL values:
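[Editor's note: the JSON keys above are written with Cygnus's name encoding, which is active because the sink sets enable_encoding = true. Judging only from the pairs visible in this thread (e.g. recvx0054imex0054s for recvTimeTs), characters outside the lowercase-letter/digit/underscore set appear to be escaped as `x` plus the four-hex-digit Unicode code point. The round-trip below is a minimal sketch of that inferred rule, not Cygnus's actual implementation; it also ignores the ambiguity of a literal `x` in a name.]

```python
import re

def encode(name):
    # Inferred rule: escape every character that is not a lowercase ASCII
    # letter, digit or underscore as 'x' + 4-hex-digit Unicode code point.
    return ''.join(
        c if (c.islower() and c.isascii()) or c.isdigit() or c == '_'
        else f'x{ord(c):04x}'
        for c in name
    )

def decode(name):
    # Reverse the escape: 'x' followed by exactly 4 hex digits -> character.
    return re.sub(r'x([0-9a-f]{4})', lambda m: chr(int(m.group(1), 16)), name)

print(decode('recvx0054imex0054s'))         # recvTimeTs
print(decode('fiwarex0053ervicex0050ath'))  # fiwareServicePath
print(encode('attrName'))                   # attrx004eame
```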
> hive> select * from ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row;
> chmod: changing permissions of 'hdfs://hadoop01:9000/tmp/hive/hive/ae8cd691-6e96-428d-8ece-02f6e6fbf527/hive_2019-05-02_16-37-44_821_8122085527913292696-1/-mr-10001': Permission denied: user=hive, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwx-w----
> OK
> ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.recvx0054imex0054s ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.recvx0054ime ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.fiwarex0053ervicex0050ath ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.entityx0049d ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.entityx0054ype ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx004eame ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx0054ype ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx0056alue ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row.attrx004dd
> NULL NULL NULL NULL NULL NULL NULL NULL NULL
> NULL NULL NULL NULL NULL NULL NULL NULL NULL
> NULL NULL NULL NULL NULL NULL NULL NULL NULL
> NULL NULL NULL NULL NULL NULL NULL NULL NULL
> Time taken: 0.26 seconds, Fetched: 4 row(s)
>
> The external table creation query in the Cygnus debug console message is:
>
> create external table
> if not exists default.ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row
> (recvx0054imex0054s bigint,
> recvx0054ime string,
> fiwarex0053ervicex0050ath string,
> entityx0049d string,
> entityx0054ype string,
> attrx004eame string,
> attrx0054ype string,
> attrx0056alue string,
> attrx004dd array<struct<name:string,type:string,value:string>>)
> row format serde 'org.openx.data.jsonserde.JsonSerDe' with serdeproperties ("dots.in.keys" = "true")
> location '/user/ngsi/Light/LightInfo/bedroom_light_Light'
>
> I guess the difference
> between the Hive table column names and the HDFS file attribute names is the problem.
> The Hive table column names use the encoded names, but the HDFS file attribute names do not (e.g. recvTimeTs vs. recvx0054imex0054s).
>
> So I created an external table myself and tested it:
>
> create external table if not exists default.test01
> (recvx0054imex0054s bigint,
> recvx0054ime string,
> fiwarex0053ervicex0050ath string,
> entityx0049d string,
> entityx0054ype string,
> attrx004eame string,
> attrx0054ype string,
> attrx0056alue string,
> attrx004dd array<struct<name:string,type:string,value:string>>)
> row format serde 'org.openx.data.jsonserde.JsonSerDe' with serdeproperties ("dots.in.keys" = "true")
> location '/user/ngsi/Light/LightInfo/bedroom_light_Light'
>
> As a result, it shows the right values:
>
> hive> select * from test01;
> chmod: changing permissions of 'hdfs://hadoop01:9000/tmp/hive/hive/bec8d67d-7d94-4cdc-a577-14d990ca342c/hive_2019-05-02_16-42-59_290_896718721577759467-1/-mr-10001': Permission denied: user=hive, access=EXECUTE, inode="/tmp":hadoop:supergroup:drwx-w----
> OK
> 1556776186 2019-05-02T05:49:46.771Z /LightInfo bedroom3_light Light lux Float 7 []
> 1556776186 2019-05-02T05:49:46.771Z /LightInfo bedroom3_light Light status string off []
> 1556776989 2019-05-02T06:03:09.962Z /LightInfo bedroom3_light Light lux Float 7 []
> 1556776989 2019-05-02T06:03:09.962Z /LightInfo bedroom3_light Light status string off []
> Time taken: 2.329 seconds, Fetched: 4 row(s)
>
> How can I match the Hive table column names to the HDFS file attribute names?
> Or is there another way to use Cygnus with Hive?

-- 
This message was sent by Atlassian JIRA (v6.4.1#64016)
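[Editor's note: since the manually created external table with the encoded column names does return the correct values, one workaround, purely an editor's sketch and not an answer from the thread, is to wrap that table in a Hive view that aliases each encoded column back to a readable name. The view name light_readable below is hypothetical, and the decode rule is inferred from the observed pairs such as recvTimeTs / recvx0054imex0054s. The aliased DDL can be generated mechanically:]

```python
import re

# Encoded column names as they appear in the Cygnus-created Hive table.
COLUMNS = [
    'recvx0054imex0054s', 'recvx0054ime', 'fiwarex0053ervicex0050ath',
    'entityx0049d', 'entityx0054ype', 'attrx004eame', 'attrx0054ype',
    'attrx0056alue', 'attrx004dd',
]

def decode(name):
    # Undo the 'x' + 4-hex-digit Unicode escape used in the column names.
    return re.sub(r'x([0-9a-f]{4})', lambda m: chr(int(m.group(1), 16)), name)

def view_ddl(view_name, table_name):
    # Build a CREATE VIEW statement aliasing every encoded column back to
    # its decoded, human-readable name.
    aliases = ',\n  '.join(f'`{c}` AS `{decode(c)}`' for c in COLUMNS)
    return (f'CREATE VIEW IF NOT EXISTS {view_name} AS\n'
            f'SELECT\n  {aliases}\n'
            f'FROM {table_name}')

print(view_ddl('light_readable',
               'ngsi_x004cightx002fx004cightx0049nfox002fbedroom_light_x004cight_row'))
```

Queries can then use `select recvTimeTs, attrName, attrValue from light_readable`, while the underlying Cygnus-created table stays untouched.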