Apache NiFi

Practicing Structured Data Loading 01

📖 Structured Data Loading Scenario 01

1. Collect the data file <initials>_business_data_20230213.csv from the /nifi_data/raw_dataset directory on the FTP server into the local NiFi instance.

[ FTP server connection info ]
HOST : host address
PORT : 21
ID : server ID
PW : server password

2. Count the records.

3. Rename the record columns to match the business_dist_data table in the nifi schema of the postgres database on the server's PostgreSQL instance.

4. Load the records into the business_dist_data table in the nifi schema of the server's PostgreSQL database.

[ DB connection info ]
 HOST : host address (same as the FTP server host)
 PORT : DB port number
 DB name : postgres
 ID : postgres
 PW : 
 schema : nifi
 table name : business_dist_data

5. Upload the loaded file to the FTP server, into a folder named after your own initials under /nifi_data/result_dataset.

6. Once loading is complete, move the original file from /nifi_data/raw_dataset on the FTP server to /nifi_data/backup_dataset.

 

 

 

 

The flow that loaded the data successfully

- GetFTP property settings

- QueryRecord property settings

SELECT
    "상가업소번호" AS house_no,
    "상호명" AS cmpny_nm,
    "지점명" AS point_nm,
    "상권업종대분류코드" AS industry_lgcls_cd,
    "상권업종대분류명" AS industry_lgcls_nm,
    "상권업종중분류코드" AS industry_mdcls_cd,
    "상권업종중분류명" AS industry_mdcls_nm,
    "상권업종소분류코드" AS industry_smcls_cd,
    "상권업종소분류명" AS industry_smcls_nm,
    "표준산업분류코드" AS std_cls_cd,
    "표준산업분류명" AS std_cls_nm,
    "시도코드" AS sd_cd,
    "시도명" AS sd_nm,
    "시군구코드" AS sgg_cd,
    "시군구명" AS sgg_nm,
    "행정동코드" AS hd_cd,
    "행정동명" AS hd_nm,
    "법정동코드" AS bd_cd,
    "법정동명" AS bd_nm,
    "지번코드" AS addr_cd,
    "대지구분코드" AS site_cd,
    "대지구분명" AS site_se_nm,
    "지번본번지" AS addr_main_no,
    "지번부번지" AS addr_sub_no,
    "지번주소" AS addr_data,
    "도로명코드" AS road_cd,
    "도로명" AS road_nm,
    "건물본번지" AS bud_main_no,
    "건물부번지" AS bud_sub_no,
    "건물관리번호" AS bud_mng_no,
    "건물명" AS bud_nm,
    "도로명주소" AS bud_addr,
    "구우편번호" AS old_zip_cd,
    "신우편번호" AS new_zip_cd,
    "동정보" AS dong_info,
    "층정보" AS flor_info,
    "호정보" AS room_info,
    "경도" AS logtd,
    "위도" AS lattd
FROM flowfile
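
For readers who want to check the mapping outside NiFi, the renaming that the QueryRecord query performs can be sketched in plain Python. This is only an illustration and not part of the flow: the `COLUMN_MAP` subset, the `rename_columns` helper, and the sample row are all assumptions made up for the example.

```python
import csv
import io

# Subset of the QueryRecord mapping (source CSV header -> table column).
COLUMN_MAP = {
    "상가업소번호": "house_no",
    "상호명": "cmpny_nm",
    "경도": "logtd",
    "위도": "lattd",
}


def rename_columns(csv_text: str, column_map: dict[str, str]) -> list[dict[str, str]]:
    """Return rows keyed by the target column names, keeping only mapped columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {target: row[source] for source, target in column_map.items()}
        for row in reader
    ]


# Hypothetical sample data, not taken from the real file.
sample = "상가업소번호,상호명,경도,위도\n1001,Sample Shop,127.0,37.5\n"
renamed = rename_columns(sample, COLUMN_MAP)
print(renamed[0])
```

Like the `SELECT ... AS ...` form, the sketch drops any column that is not listed in the mapping, so the output only carries names that exist in the target table.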

 

 

 

 

- CalculateRecordStats property settings
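
As a rough cross-check of the count that NiFi reports, the number of records in a CSV file can also be computed locally. A minimal sketch, assuming the file has exactly one header row; `count_records` and the sample text are hypothetical and used only for illustration.

```python
import csv
import io


def count_records(csv_text: str) -> int:
    """Count data rows in CSV text, excluding the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row if there is one
    return sum(1 for _ in reader)


# Hypothetical sample: one header row followed by three records.
sample = "house_no,cmpny_nm\n1,A\n2,B\n3,C\n"
print(count_records(sample))
```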

 

 

 

 

 

 

- PutDatabaseRecord property settings
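
PutDatabaseRecord reaches the database through a connection pool controller service, which expects a JDBC URL rather than separate host and port fields. The sketch below assembles that URL from the DB connection info in the scenario, assuming the standard PostgreSQL JDBC URL format. `build_jdbc_url` is a hypothetical helper, and the host and port in the example are placeholders, not the real server values.

```python
def build_jdbc_url(host: str, port: int, database: str) -> str:
    """Build a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database."""
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Placeholder host and port; substitute the values from the DB connection info.
url = build_jdbc_url("db.example.com", 5432, "postgres")
print(url)
```

The schema (nifi) and table (business_dist_data) are not part of the URL; in PutDatabaseRecord they are typically set through the Schema Name and Table Name properties.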

 

 

 

 

 

 

 

 

- PutFTP property settings

- FetchFTP property settings

Remote File : /nifi_data/result_dataset/boram/br_business_data_20230213.csv

Spell out the full path explicitly, including the file name.

Collecting the data file <initials>_business_data_20230213.csv from the /nifi_data/raw_dataset directory on the FTP server into the local NiFi instance

Renaming the record columns to match the business_dist_data table in the nifi schema of the postgres database on the server's PostgreSQL instance

Loading the records, in the form defined in QueryRecord, into the business_dist_data table in the nifi schema of the server's PostgreSQL database

Uploading the loaded file to the FTP server, into a folder named after your own initials under /nifi_data/result_dataset

Once loading is complete, moving the original file from /nifi_data/raw_dataset on the FTP server to /nifi_data/backup_dataset

→ In this run the file was not actually moved on the server, because of a path error that occurs on Windows. The process and the property settings themselves are configured correctly.
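
The post does not record the exact error, so the following is only an illustration of one way a Windows-side path can differ from what most FTP servers expect: joining path segments with Windows rules inserts a backslash, while remote FTP paths use forward slashes. Whether this was the actual cause here is an assumption, and `backup_path` is a hypothetical helper.

```python
import ntpath
import posixpath

BACKUP_DIR = "/nifi_data/backup_dataset"


def backup_path(filename: str) -> str:
    """Build the remote backup path with forward slashes, regardless of the local OS."""
    return posixpath.join(BACKUP_DIR, filename)


filename = "br_business_data_20230213.csv"

# Windows-style joining inserts a backslash before the file name.
windows_style = ntpath.join(BACKUP_DIR, filename)

# POSIX-style joining keeps the forward slashes used in remote FTP paths.
remote_style = backup_path(filename)

print(windows_style)
print(remote_style)
```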