Wednesday, August 16, 2017

Artificial Intelligence and Property Price Prediction (Part 1)

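Artificial intelligence has been in the spotlight recently thanks to AlphaGo: the computer player (AlphaGo) beat the human player (Ke Jie) three games to nil. AlphaGo became a Go master through continuous self-play and learning. I expect AI to be widely used in everyday life in the near future.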


Can machine learning be used to predict where the property market is heading? The best way to find out is to try it. These are the tools I used.
1) Ubuntu
2) iPython notebook
3) keras

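Step 1: Data collection
The Hong Kong government publishes property price and rent data online, grouped into 5 classes by flat size:
Under 40 square metres          Class A
40 to 69.9 square metres        Class B
70 to 99.9 square metres        Class C
100 to 159.9 square metres      Class D
160 square metres or above      Class E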
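The size bands above can be written as a small lookup. This is only a sketch: the helper name `size_class` is hypothetical and does not come from the original notebook, and boundary values are handled the way the bands are listed (exactly 40 square metres falls in Class B, exactly 160 in Class E).

```python
def size_class(area_m2: float) -> str:
    """Map a flat's area in square metres to class A-E (hypothetical helper)."""
    if area_m2 < 0:
        raise ValueError("area must be non-negative")
    # Each upper bound is exclusive, so 40 belongs to B, 70 to C, and so on.
    for upper_bound, label in ((40, "A"), (70, "B"), (100, "C"), (160, "D")):
        if area_m2 < upper_bound:
            return label
    return "E"


for area in (35, 40, 69.9, 120, 160):
    print(area, size_class(area))
```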

(Monthly price and rent data can be downloaded from http://www.rvd.gov.hk/doc/en/statistics/)

This time I mainly use an LSTM to predict Class A (Hong Kong).
Features: monthly price and monthly rent
Label: the price 3 months later
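To make the feature and label definition concrete, here is a minimal sketch of pairing each month's (price, rent) with the price three months later. The function name `make_supervised` and its `horizon` parameter are hypothetical; the original notebook's own windowing code is not shown in this part and may differ. The sample numbers are the first five Class A (Hong Kong) rows printed further down in this post.

```python
import numpy as np


def make_supervised(price, rent, horizon=3):
    """Pair each month's (price, rent) with the price `horizon` months later."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    price = np.asarray(price, dtype=float)
    rent = np.asarray(rent, dtype=float)
    if len(price) != len(rent):
        raise ValueError("price and rent must have the same length")
    # The last `horizon` months have no future price yet, so they are dropped.
    features = np.column_stack([price, rent])[:-horizon]
    labels = price[horizon:]
    return features, labels


price = [42663.0, 43068.0, 42683.0, 43223.0, 43316.0]
rent = [190, 196, 199, 191, 191]
X, y = make_supervised(price, rent, horizon=3)
print(X)  # two rows: January and February 1999
print(y)  # the April and May 1999 prices
```

With five months of data and a three-month horizon, only two training pairs remain, which is why a longer history matters for this kind of setup.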

First, import the libraries
import os, pandas
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
import math, quandl, keras
import seaborn as sns
import numpy as np # linear algebra
from keras.optimizers import Adam
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from pandas import read_csv


Set parameters
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
look_back = 3      # label is the price 3 months later
batch_size = 1     # batch size
np.random.seed(8)  # random seed
trainSplit=88      # train/test split ratio
epochsValue=1000   # number of training epochs
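This part of the post does not show how `trainSplit=88` is applied. Assuming it means that the first 88 percent of the months go to training and the rest to testing, kept in time order, a split could look like the sketch below. The function `chronological_split` is hypothetical; the count of 222 months comes from the 216 months of 1999 to 2016 plus the 6 months of 2017 reported in the tables later in this post.

```python
def chronological_split(rows, train_percent=88):
    """Split an ordered sequence into (train, test) without shuffling."""
    if not 0 < train_percent < 100:
        raise ValueError("train_percent must be between 0 and 100")
    # Assumption: the value is a percentage of rows, rounded down.
    cut = int(len(rows) * train_percent / 100)
    return rows[:cut], rows[cut:]


months = list(range(222))  # stand-in for 222 monthly observations
train, test = chronological_split(months)
print(len(train), len(test))  # 195 27
```

Keeping the order matters for time series: shuffling before the split would let the model train on months that come after the ones it is tested on.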




Load the CSV files into DataFrames
#Read Data
data=pd.read_csv("source/DomesticPrice.csv")
data["Year"] = data["Year"].astype(str)
data["Month"] = data["Month"].astype(str)
data["Date"]  = [dt.datetime.strptime(d,'%Y%m').date() for d in data["Date"].astype(str)]
data["Value"] = (data["Value"].replace(' ','')).astype(float)

rent=pd.read_csv("source/DomesticRent.csv")
rent["Year"] = rent["Year"].astype(str)
rent["Month"] = rent["Month"].astype(str)
#rent['Date'] = rent[['Year', 'Month']].apply(lambda x: ''.join(x), axis=1)
rent["Date"]  = [dt.datetime.strptime(d,'%Y%m').date() for d in rent["Date"].astype(str)]

data['Rent'] = rent['Value']

#data.dtypes
print('Average Price by Class')
print(data.head(),'\n')



Average Price by Class
   Year Month        Date Class      Place    Value  Rent
0  1999     1  1999-01-01     A  Hong Kong  42663.0   190
1  1999     2  1999-02-01     A  Hong Kong  43068.0   196
2  1999     3  1999-03-01     A  Hong Kong  42683.0   199
3  1999     4  1999-04-01     A  Hong Kong  43223.0   191
4  1999     5  1999-05-01     A  Hong Kong  43316.0   191 
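Note that `data['Rent'] = rent['Value']` pairs rows by position, so it is only correct if both CSV files list the same year, month, class and place in the same order. If that cannot be guaranteed, a key-based join is a safer alternative. The sketch below uses two tiny frames named `price_df` and `rent_df` (hypothetical names), built from the first two rows of the output above with the rent rows deliberately reversed.

```python
import pandas as pd

keys = ["Year", "Month", "Class", "Place"]

price_df = pd.DataFrame({
    "Year": ["1999", "1999"], "Month": ["1", "2"],
    "Class": ["A", "A"], "Place": ["Hong Kong", "Hong Kong"],
    "Value": [42663.0, 43068.0],
})
# Same two months, listed in the opposite order on purpose.
rent_df = pd.DataFrame({
    "Year": ["1999", "1999"], "Month": ["2", "1"],
    "Class": ["A", "A"], "Place": ["Hong Kong", "Hong Kong"],
    "Value": [196, 190],
})

# Join on the identifying columns; validate raises if a key is duplicated.
merged = price_df.merge(
    rent_df.rename(columns={"Value": "Rent"}),
    on=keys, how="left", validate="one_to_one",
)
print(merged[["Year", "Month", "Value", "Rent"]])
```

A positional assignment would have attached rent 196 to January here, while the merge matches each price to the rent of the same month.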



#Check whether there are any NULL values

#Verify any Null Value
print(data.isnull().any())


Year     False
Month    False
Date     False
Class    False
Place    False
Value    False
Rent     False


Before making any predictions, you first need to understand the data you have.

#Histogram for average price by Class from 1999 to 2016
%matplotlib inline
%pylab inline
pylab.rcParams['figure.figsize'] = (20, 20)

print('\n\nAverage price from Year 1999 to 2016  - 每平方米售價')
print(round(data.where(data.Year != '2017').dropna()['Value'].groupby([data['Class'], data['Place']]).describe(percentiles=[])))
sns.distplot(data.where(data.Year != '2017').dropna()['Value'])



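Price distribution
1) 1999 to 2016: MEAN / STD / MIN / MAX of prices by class and district (每平方米售價 in the output means selling price per square metre)
     Taking Class A (Hong Kong) as an example, the max is about 6.5 times the min, and the mean is 70,183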

Average price from Year 1999 to 2016  - 每平方米售價
                       count      mean      std      min       50%       max
Class Place                                                                 
A     Hong Kong        216.0   70183.0  40555.0  23363.0   51771.0  151462.0
      Kowloon          216.0   54120.0  32292.0  19768.0   38178.0  124574.0
      New Territories  216.0   49493.0  27801.0  19724.0   36409.0  114705.0
B     Hong Kong        216.0   76817.0  38650.0  27661.0   61530.0  153461.0
      Kowloon          216.0   60796.0  33152.0  19834.0   47356.0  128291.0
      New Territories  216.0   47274.0  23222.0  20193.0   36118.0   99686.0
C     Hong Kong        216.0   96861.0  45127.0  36005.0   82718.0  181770.0
      Kowloon          216.0   83604.0  44138.0  23706.0   73494.0  166294.0
      New Territories  216.0   55656.0  23820.0  24439.0   45115.0  107064.0
D     Hong Kong        216.0  117856.0  54801.0  40724.0  107832.0  215879.0
      Kowloon          216.0  100424.0  50509.0  32407.0   90700.0  202067.0
      New Territories  216.0   59254.0  20859.0  25747.0   54932.0  106307.0
E     Hong Kong        216.0  158109.0  76408.0      0.0  145088.0  351027.0
      Kowloon          216.0  127965.0  77787.0      0.0  113680.0  557678.0
      New Territories  216.0   62025.0  20291.0  27461.0   62935.0  131290.0
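The table above shows a minimum of 0.0 for Class E in Hong Kong and Kowloon, which `isnull()` does not flag. One possible reading, which is my assumption and not something the source data confirms, is that 0 marks a month with no recorded figure. Under that assumption, zeros can be counted and treated as missing before computing statistics. The sample frame below is made up for illustration only and is not taken from the real CSV.

```python
import numpy as np
import pandas as pd

# Made-up illustrative rows; the 0.0 stands in for a suspicious zero price.
sample = pd.DataFrame({
    "Class": ["E", "E", "E", "A"],
    "Place": ["Kowloon", "Kowloon", "Kowloon", "Hong Kong"],
    "Value": [0.0, 120000.0, 130000.0, 42663.0],
})

# Count zero prices per class and district.
zero_counts = sample["Value"].eq(0).groupby([sample["Class"], sample["Place"]]).sum()
print(zero_counts)

# Treat zeros as missing so they no longer drag down min and mean.
cleaned = sample.assign(Value=sample["Value"].replace(0, np.nan))
print(cleaned.groupby(["Class", "Place"])["Value"].min())
```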


Price distribution chart, 1999 to 2016









#Histogram for average price by Class from 2017 January to June
print('\n\nAverage Price from Year 2017 January - June  - 每平方米售價')
print(round(data.where(data.Year == '2017').dropna()['Value'].groupby([data['Class'], data['Place']]).describe(percentiles=[])))
sns.distplot(data.where(data.Year == '2017').dropna()['Value'])

2) January to June 2017: MEAN / STD / MIN / MAX of prices by class and district
In 2017 the mean for Class A (Hong Kong) is 154,314




Average Price from Year 2017 January - June  - 每平方米售價
                       count      mean       std       min       50%       max
Class Place                                                                   
A     Hong Kong          6.0  154314.0    6058.0  145930.0  154687.0  160842.0
      Kowloon            6.0  126572.0    4304.0  121875.0  127037.0  132952.0
      New Territories    6.0  119037.0    4179.0  112391.0  120082.0  123121.0
B     Hong Kong          6.0  158427.0    9004.0  149245.0  157392.0  173099.0
      Kowloon            6.0  127969.0    5543.0  118786.0  128000.0  135443.0
      New Territories    6.0  104152.0    3749.0   98651.0  105234.0  109003.0
C     Hong Kong          6.0  183396.0    5055.0  177600.0  183794.0  191130.0
      Kowloon            6.0  161993.0    9420.0  149225.0  164392.0  171789.0
      New Territories    6.0  108778.0    1432.0  107401.0  108334.0  111360.0
D     Hong Kong          6.0  210984.0   11912.0  191646.0  215463.0  220623.0
      Kowloon            6.0  169772.0    9997.0  155837.0  170111.0  183973.0
      New Territories    6.0  104136.0    8928.0   92255.0  104143.0  114054.0
E     Hong Kong          6.0  242705.0   28044.0  199533.0  252744.0  275942.0
      Kowloon            6.0  248105.0  146072.0       0.0  258817.0  438692.0
      New Territories    6.0   96008.0   12629.0   80038.0   94472.0  113930.0
Price distribution chart, 2017




Rent distribution
#Histogram for rent by Class from 1999 to 2016
%matplotlib inline
pylab.rcParams['figure.figsize'] = (20, 20)
print('\n\nAverage rent from Year 1999 to 2016   - 每平方米租金')
print(round(rent.where(rent.Year != '2017').dropna()['Value'].groupby([rent['Class'], rent['Place']]).describe(percentiles=[])))
sns.distplot(rent.where(rent.Year != '2017').dropna()['Value'])
1) 1999 to 2016: MEAN / STD / MIN / MAX of rent by class and district (每平方米租金 in the output means rent per square metre)
Average rent from Year 1999 to 2016   - 每平方米租金
                       count   mean   std    min    50%    max
Class Place                                                   
A     Hong Kong        216.0  262.0  90.0  146.0  236.0  470.0
      Kowloon          216.0  202.0  69.0  118.0  176.0  367.0
      New Territories  216.0  159.0  58.0   87.0  134.0  287.0
B     Hong Kong        216.0  252.0  77.0  139.0  229.0  409.0
      Kowloon          216.0  198.0  64.0  112.0  166.0  334.0
      New Territories  216.0  144.0  47.0   83.0  121.0  249.0
C     Hong Kong        216.0  299.0  69.0  181.0  281.0  437.0
      Kowloon          216.0  237.0  65.0  139.0  220.0  400.0
      New Territories  216.0  164.0  45.0   91.0  147.0  261.0
D     Hong Kong        216.0  331.0  72.0  200.0  320.0  454.0
      Kowloon          216.0  249.0  59.0  146.0  241.0  379.0
      New Territories  216.0  203.0  44.0  114.0  192.0  321.0
E     Hong Kong        216.0  382.0  78.0  244.0  372.0  565.0
      Kowloon          216.0  243.0  81.0    0.0  228.0  488.0
      New Territories  216.0  211.0  50.0    0.0  210.0  343.0


#Histogram for rent by Class from 2017 January to June
print('Average rent from Year 2017 January - June   - 每平方米租金')
print(round(rent.where(rent.Year == '2017').dropna()['Value'].groupby([rent['Class'], rent['Place']]).describe(percentiles=[])))
sns.distplot(rent.where(rent.Year == '2017').dropna()['Value'])


2) January to June 2017: MEAN / STD / MIN / MAX of rent by class and district
Average rent from Year 2017 January - June   - 每平方米租金
                       count   mean   std    min    50%    max
Class Place                                                   
A     Hong Kong          6.0  435.0  10.0  424.0  434.0  453.0
      Kowloon            6.0  342.0   9.0  331.0  340.0  357.0
      New Territories    6.0  285.0   6.0  278.0  284.0  293.0
B     Hong Kong          6.0  398.0   7.0  386.0  399.0  406.0
      Kowloon            6.0  324.0  10.0  314.0  320.0  336.0
      New Territories    6.0  248.0   9.0  240.0  244.0  262.0
C     Hong Kong          6.0  424.0   5.0  419.0  422.0  432.0
      Kowloon            6.0  353.0  14.0  334.0  355.0  366.0
      New Territories    6.0  256.0   8.0  248.0  254.0  266.0
D     Hong Kong          6.0  438.0   6.0  430.0  441.0  444.0
      Kowloon            6.0  343.0  17.0  322.0  342.0  366.0
      New Territories    6.0  248.0  11.0  235.0  244.0  267.0
E     Hong Kong          6.0  432.0  23.0  395.0  436.0  455.0
      Kowloon            6.0  381.0  53.0  331.0  364.0  465.0
      New Territories    6.0  234.0  10.0  221.0  233.0  251.0







