Hive table creation
Create table for import data with fields with CSVhive> create table Account(id int,name string,phone string) row format delimited fields terminated by ',';
Create table for secured account where PI column would be masked
hive> create table Accountmasked(id int,name string,phone string) row format delimited fields terminated by ',';
Create contact table
hive> create table contact(id int,accountid int,firstname string,lastName string, phone string,email string) row format delimited fields terminated by ',';
Create secured contact table where PI column would be masked
hive> create table securedcontact(id int,accountid int,firstname string, lastName string,phone string,email string) row format delimited fields terminated by ',';
MySQL table creation
Create account and contact table in mysql
create table accounts(id int not null,name varchar(255) not null,phone varchar(255) not null,primary key (id) ); create table contacts(id int not null,accountid int not null,firstname
varchar(255) not null, lastName varchar(255),phone varchar(100),email varchar(100),primary key (id));
MySQL Data ingestion
Import data from account.csv to accounts and contacts.csv to contacts table in mysql. Ignore the headersLOAD DATA LOCAL INFILE '/home/cloudera/workspace/Contact.csv'
into table contacts fields terminated by ','
IGNORE 1 LINES;
LOAD DATA LOCAL INFILE '/home/cloudera/workspace/Account.csv'
into table accounts fields terminated by ','
IGNORE 1 LINES;
Import data to hive from mysql contacts and account tables
sqoop import --connect jdbc:mysql://localhost:3306/securemask
--username root --password cloudera
--table contacts
--fields-terminated-by ","
--target-dir /user/cloudera/workspace/contactsmask
--hive-import
sqoop import --connect jdbc:mysql://localhost:3306/securemask
--username root --password cloudera
--table accounts --fields-terminated-by ","
--target-dir /user/cloudera/workspace/accountsmask
--hive-import
sqoop import --connect jdbc:mysql://localhost:3306/securemask
--username root --password cloudera
--table contacts
--fields-terminated-by ","
--target-dir /user/cloudera/workspace/contactsmask
--hive-import sqoop import --connect jdbc:mysql://localhost:3306/securemask
--username root --password cloudera
--table accounts --fields-terminated-by ","
--target-dir /user/cloudera/workspace/accountsmask
--hive-import
VALIDATION
Hive tables can be seen using hue
CREATE UDF function in Javapackage hiveudf; import org.apache.hadoop.hive.ql.exec.UDF; public class PImask extends UDF{ public String evaluate(String s) { int len = s.length(); char[] input = s.toCharArray(); if (len > 2) { for (int counter = 1; counter < len - 1; counter++) if (input[counter] != ' ') input[counter] = '*'; } else { for (int counter = 0; counter < len; counter++) { if (input[counter] != ' ') input[counter] = '*'; } } return String.valueOf(input); } public String evaluate(int i) { String IntToString = Integer.toString(i); int len = IntToString.length(); char[] input = IntToString.toCharArray(); for (int counter = 0; counter < len; counter++) input[counter] = '*'; String maskedIntString = String.valueOf(input); return (maskedIntString); } }
Add UDF function
hive> ADD JAR /home/cloudera/mask.jar;
hive> CREATE TEMPORARY FUNCTION MASK AS 'hiveudf.PImask';
hive> create table securedaccounts like accounts;
DATA MASKING
Insert masked phone numer into accounts
insert overwrite table securedaccounts select id,name,MASK(phone) from accounts;
Iinsert masked phone and email into contacts
insert overwrite table securedcontacts select id,accountid,firstname,
lastname,MASK(phone), MASK(email) from contacts;
Move tables (securedcontacts,securedaccounts) from hive to my SQL
sqoop export --connect jdbc:mysql://localhost:3306/securemask
--username root --password cloudera
--table contacts
--export-dir /user/hive/warehouse/securedcontacts sqoop export --connect jdbc:mysql://localhost:3306/securemask
--username root --password cloudera
--table accounts
--export-dir /user/hive/warehouse/securedaccounts
Note: I had to create a new table in mysql for import
Wow,great information. I am sure the info on your blog will help others,Thanks.
ReplyDeleteSpring Training in Chennai
Spring Training in Anna Nagar
Spring Training in T Nagar
Hibernate Training in Chennai
Hibernate Training
Spring Hibernate Training in Chennai
Spring Training in Porur
Spring Hibernate Training
Amazing post.Thanks for your details and explanations..I want more information from your side.Thank you
ReplyDeleteArticle submission sites
Technology
Awesome informations that you have shared for us.I eagerly waiting for more updates in future.
ReplyDeleteHadoop Training in Chennai
Big data training in chennai
Big Data Course in Chennai
JAVA Training in Chennai
Python Training in Chennai
Selenium Training in Chennai
Hadoop training in chennai
Big data training in chennai
big data course in chennai
Such an interesting blog,i gather more useful information...
ReplyDeleteAviation Courses in Chennai
Air hostess Training Institute in Bangalore
air hostess training fees in mumbai
air hostess training in chennai
Aviation courses in Bangalore
air hostess training in chennai
Air Hostess Training Institute in chennai
Aviation Courses in Chennai
aviation institute in bangalore
air hostess course in chennai
Thanks for sharing the information. I found the information very useful. That’s an awesome story you posted. I will come back to scan some more.
ReplyDelete