About the developer

TurboWay
183 Stars 30 Forks 9 Commits 0 Opened issues

Description

Big data analysis and visualization practice


bigdata_practice

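Big data practice project: nginx log analysis and visualization

Overview

Analyzes nginx logs in two ways, stream processing and batch processing, and displays the results as charts using flask + echarts.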
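As an illustration of the visualization step, the sketch below turns per-hour request counts into an ECharts line-chart option that a flask view could return as JSON. The function name `build_hourly_trend_option` and the sample counts are hypothetical and are not taken from the project's code; only the `xAxis` / `yAxis` / `series` option layout follows ECharts itself.

```python
import json


def build_hourly_trend_option(hourly_counts):
    """Build an ECharts line-chart option from a {hour: count} mapping.

    Hours missing from the mapping are plotted as 0 so the x-axis
    always covers the full 24 hours.
    """
    hours = list(range(24))
    return {
        "title": {"text": "24-hour access trend"},
        "tooltip": {"trigger": "axis"},
        "xAxis": {"type": "category", "data": [f"{h:02d}:00" for h in hours]},
        "yAxis": {"type": "value"},
        "series": [
            {"type": "line", "data": [hourly_counts.get(h, 0) for h in hours]}
        ],
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    option = build_hourly_trend_option({9: 120, 10: 340, 11: 275})
    print(json.dumps(option, ensure_ascii=False, indent=2))
```

On the page, the resulting dictionary would be passed to `chart.setOption(...)` after being fetched as JSON.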

Data collection and analysis pipeline

image-20201104093541868

Approach 1: offline batch processing with hive + datax + mysql

Approach 2: real-time stream processing with flume + kafka + python + mysql
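Both approaches start from raw nginx access-log lines. As a minimal sketch, assuming the logs use nginx's default `combined` format (the project's actual log format is not stated here), a line can be split into named fields with a regular expression. The names `LOG_PATTERN` and `parse_line` and the sample line are hypothetical.

```python
import re

# Assumes the default nginx "combined" log format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
LOG_PATTERN = re.compile(
    r'(?P<remote_addr>\S+) \S+ (?P<remote_user>\S+) '
    r'\[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" '
    r'"(?P<http_user_agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Made-up sample line, for illustration only.
    sample = (
        '203.0.113.7 - - [04/Nov/2020:09:35:41 +0800] '
        '"GET /index.html HTTP/1.1" 200 612 "-" '
        '"Mozilla/5.0 (X11; Linux x86_64)"'
    )
    print(parse_line(sample))
```

In the streaming approach, a function like this would run on each message consumed from kafka; in the batch approach, the equivalent parsing would happen in hive.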

Configuration

  • Install dependencies
    pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt
    
  • Edit the database configuration in ironman/datadb.py:
    ```python
    ENGINECONFIG = 'mysql+pymysql://root:[email protected]:3306/test?charset=utf8'
    ```
  • Create the MySQL tables:
    ```sql
    -- nginx_log: access log fact table
    create table fact_nginx_log(
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `remoteaddr` VARCHAR(20),
        `time_local` TIMESTAMP(0),
        `province` VARCHAR(20),
        `request` varchar(300),
        `device` varchar(50),
        `os` varchar(50),
        `browser` varchar(100),
        PRIMARY KEY (`id`)
    ) DEFAULT CHARSET=utf8;

    -- IP-to-region mapping table
    create table dimip(
        `id` int(11) NOT NULL AUTO_INCREMENT,
        `ip` VARCHAR(20),
        `province` VARCHAR(20),
        `addtime` TIMESTAMP(0) default now(),
        PRIMARY KEY (`id`)
    ) DEFAULT CHARSET=utf8;
    ```
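To show how a parsed log record might map onto the fact_nginx_log columns, here is a hedged sketch that converts nginx's `$time_local` text into a `datetime` and builds a parameter tuple for a `%s`-style parameterized insert, the placeholder style pymysql accepts. The record keys, the `to_row` helper, and the column name `time_local` are assumptions made for illustration; the project's own loading code may differ.

```python
from datetime import datetime

# Parameterized statement; the values are supplied separately at execute time.
INSERT_SQL = (
    "INSERT INTO fact_nginx_log "
    "(remoteaddr, time_local, province, request, device, os, browser) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s)"
)


def to_row(record):
    """Convert a parsed log record (dict) into a tuple matching INSERT_SQL."""
    # nginx writes $time_local like "04/Nov/2020:09:35:41 +0800".
    ts = datetime.strptime(record["time_local"], "%d/%b/%Y:%H:%M:%S %z")
    return (
        record["remoteaddr"],
        ts.replace(tzinfo=None),       # keep the local wall-clock time
        record.get("province"),
        record["request"][:300],       # the column is varchar(300)
        record.get("device"),
        record.get("os"),
        record.get("browser"),
    )


if __name__ == "__main__":
    # Made-up record, for illustration only.
    row = to_row({
        "remoteaddr": "203.0.113.7",
        "time_local": "04/Nov/2020:09:35:41 +0800",
        "request": "GET /index.html HTTP/1.1",
    })
    print(INSERT_SQL)
    print(row)
```

With an open pymysql connection, the tuple would be passed as the second argument to `cursor.execute(INSERT_SQL, row)`.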

Run

Run: cd ironman; python app.py

Then open http://127.0.0.1:5000/

Screenshots

24-hour access trend

image

Daily visits

image

Client device share

image

User distribution

image

Crawler word cloud

image
