data_science

by lampts

daily curated links in DS, DL, NLP, ML

Seeing is believing. A witty saying proves nothing.

"When solving a problem of interest, do not solve a more general problem as an intermediate step." (Vladimir Vapnik)

Must read

  • foundation of dl: https://www.youtube.com/watch?time_continue=157&v=zl99IZvW7rE
  • (Bradley) Bayesian, Frequentist and Scientist: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.179.1454&rep=rep1&type=pdf
  • (Breiman) 2 cultures: http://www2.math.uu.se/~thulin/mm/breiman.pdf
  • https://gluebenchmark.com/leaderboard

My implementations

  • nb
  • lr
  • trees: https://stats.stackexchange.com/questions/231220/how-to-compute-the-gradient-and-hessian-of-logarithmic-loss-question-is-based
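
The trees link above is about the gradient and Hessian of logarithmic loss. A minimal sketch of those two quantities, taken with respect to the raw score z (before the sigmoid) for a label y in {0, 1}. The helper names and the regularised leaf-weight formula -G / (H + lambda) are illustrative assumptions in the style of second-order boosting, not code from this repository.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logloss(y, z):
    """Log loss for one example with label y in {0, 1} and raw score z."""
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def logloss_grad_hess(y, z):
    """First and second derivative of the log loss with respect to z."""
    p = sigmoid(z)
    return p - y, p * (1.0 - p)


def newton_leaf_weight(labels, scores, reg_lambda=1.0):
    """Hypothetical helper: one Newton step for all examples in a leaf."""
    g_sum = 0.0
    h_sum = 0.0
    for y, z in zip(labels, scores):
        g, h = logloss_grad_hess(y, z)
        g_sum += g
        h_sum += h
    return -g_sum / (h_sum + reg_lambda)


grad, hess = logloss_grad_hess(1, 0.0)
leaf = newton_leaf_weight([1, 1, 0], [0.0, 0.0, 0.0])
```

With y = 1 and z = 0 the predicted probability is 0.5, so the gradient is -0.5 and the Hessian is 0.25.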

Chatbot

  • https://github.com/chiphuyen/stanford-tensorflow-tutorials/tree/master/assignments/chatbot
  • https://botlist.co/
  • https://github.com/JStumpp/awesome-chatbots
  • https://github.com/fendouai/Awesome-Chatbot
  • https://github.com/dennybritz/chatbot-retrieval/
  • https://realpython.com/python-keras-text-classification/
  • https://github.com/ekapolc/nlp_course/blob/master/slides/L10.2-chatbotsOverview.pdf

RecSys

  • https://github.com/maciejkula/spotlight
  • session based: https://arxiv.org/pdf/1511.06939.pdf
  • pool next item: https://www.semanticscholar.org/paper/Deep-Neural-Networks-for-YouTube-Recommendations-Covington-Adams
  • tune nlp: http://ruder.io/deep-learning-nlp-best-practices/index.html#classification

Winning solutions

  • http://ndres.me/kaggle-past-solutions/
  • Rossmann Sales Forecasting, 1st solution: https://kaggle2.blob.core.windows.net/forum-message-attachments/102102/3454/Rossmannnr1doc.pdf

Stats

  • Good, Hardin. Common Errors in Statistics (and How to Avoid Them) (2003)
  • Kanji. 100 statistical tests (2006)
  • Doing Data Science: Straight Talk from the Frontline

Game Industry:

  • https://project.dke.maastrichtuniversity.nl/cig2018/proceedings/
  • https://www.slideshare.net/africaperianez/game-data-science-the-state-of-the-art

Case studies:

  • auc, https://www.kaggle.com/c/acquire-valued-shoppers-challenge#evaluation
  • auc, https://www.kaggle.com/c/kdd-cup-2014-predicting-excitement-at-donors-choose#description

DS Coursera

  • http://www.chioka.in/how-to-select-your-final-models-in-a-kaggle-competitio/
  • http://scikit-learn.org/stable/modules/cross_validation.html

Heroes of DL

  • Geoffrey Hinton: https://www.youtube.com/watch?v=-eyhCTvrEtE
  • Andrej Karpathy: https://www.youtube.com/watch?v=_au3yw46lcg

Top conferences:

  • KDD 2018 London, UK: http://www3.imperial.ac.uk/newsandeventspggrp/imperialcollege/engineering/datascienceinstitute/newssummary/news_22-8-2017-11-17-28
  • WSDM 2018, US: http://www.wsdm-conference.org/2018/
  • NIPS 2017, Long Beach, US: https://nips.cc/
  • DepLing 2017: http://www.depling.org/depling2017/program.html
  • CIKM 2017: http://cikm2017.org/
  • https://webdocs.cs.ualberta.ca/~zaiane/htmldocs/ConfRanking.html
  • http://www.guide2research.com/topconf/machine-learning
  • http://portal.core.edu.au/conf-ranks/?search=&by=all&source=CORE2017&sort=atitle&page=1

Deep Learning

  • http://www.deeplearningbook.org/contents/applications.html

Events: I will add a word cloud for these.

EMNLP 2017: http://noisy-text.github.io/2017/

NLPStan reading

  • http://nlp.stanford.edu/read/
  • NLP dataset: https://github.com/niderhoff/nlp-datasets

LXMLS16:

  • http://lxmls.it.pt/2016/Deep-Neural-Networks-Are-Our-Friends.pdf
  • http://lxmls.it.pt/2016/lxmls-dl2.pdf

ACL2017

  • keynote: linguistics is back, to reduce the search space: https://drive.google.com/file/d/0B2cCJQ2_aOwjMlg5MnFjTEpBNG8/view

VietAI

  • Quoc Le (Google Brain): http://cs.stanford.edu/~quocle/
  • Thang Luong (Google Brain): http://t.co/3zNHouUn
  • Dustin (Columbia) http://dustintran.com/
  • Thien (NYU) http://www.cs.nyu.edu/~thien/
  • Hieu Pham (CMU) https://www.quora.com/profile/Hieu-Pham-20
  • Ken Tran (Microsoft) http://www.kentran.net/
  • Laurent Dinh (MILA): https://laurent-dinh.github.io/about/
  • Luong Hoang, Harvard: https://github.com/lhoang29/recurrent-entity-networks
  • Vu Pham

My SOTA

  • My ATIS: sequence tagging, nb of params: 324335, bi-LSTM
  • Quora question duplicate detection: Accuracy 85% on Wang's test
 - best F1 score: 94.92/94.64
 - train scores: 97.5446666667/96.17
 - val scores: 93.664/92.94

Game industry

  • TCCP PU learning https://arxiv.org/pdf/1802.09788.pdf
  • By last time login: https://mpra.ub.uni-muenchen.de/82871/1/paper8.pdf
  • https://www.slideshare.net/aistconf/webgames-61437118

Yandex

  • https://github.com/ddtm/dl-course
  • https://github.com/vkantor/MIPTDataMiningInAction_2016/tree/master/trends
  • https://github.com/yandexdataschool/Practical_RL
  • https://github.com/yandexdataschool/HSE_deeplearning

ICLR 2017 Review

  • if you want to tune an LSTM, this is worth reading (from Socher): https://arxiv.org/pdf/1611.05104v2.pdf

LearningNewThingIn2017

  • Torch/Lua (Facebook/HarvardNLP): http://nlp.seas.harvard.edu/code/, http://cs287.fas.harvard.edu/
  • TF/Python (Google/Stanford): https://github.com/BinRoot/TensorFlow-Book
  • cs287: https://github.com/CS287/Lectures

Conf events

  • Coling 2016, Osaka Japan: http://coling2016.anlp.jp/
  • ICLR 2017, Apr in France: http://www.iclr.cc/doku.php?id=ICLR2017:main&redirect=1
  • open review: http://openreview.net/group?id=ICLR.cc/2017/conference

NIPS 2016 slides

  • https://github.com/hindupuravinash/nips2016
  • Ian GAN tut: http://www.iangoodfellow.com/slides/2016-12-9-gans.pdf
  • Ng nuts and bolts: https://www.dropbox.com/s/dyjdq1prjbs8pmc/NIPS2016%20-%20Pages%202-6%20(1).pdf
  • variational inference: http://www.cs.columbia.edu/~blei/talks/2016NIPSVI_tutorial.pdf

Theano based DL applications

  • https://news.ycombinator.com/item?id=9283105

learn to learn: algos optimization

  • sgd and friends: http://cs231n.github.io/neural-networks-3/#update
  • overview of gd: http://sebastianruder.com/optimizing-gradient-descent/
  • https://github.com/fchollet/keras/issues/898
  • My usual choice: Adam or RMSprop, tuning the learning rate and batch size.
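
For reference alongside the optimizer links above, a minimal sketch of the Adam update rule on a one-dimensional toy problem. The function name, the quadratic objective and the learning rate of 0.05 are assumptions chosen for illustration (the commonly quoted default learning rate is much smaller); this is not tied to any framework's implementation.

```python
import math


def adam_minimize(grad, x0, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=5000):
    """Minimise a 1-D function given its gradient, using the Adam update."""
    x = x0
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

RMSprop corresponds roughly to dropping the first-moment average and the bias correction, which is why the two are often tuned in the same way (learning rate first).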

People

  • http://people.stat.sc.edu/haigang/techBlog.html
  • http://aejjrsite.free.fr/goodmorning/gm122/gm122_ThayToiMauriceAllais.pdf
  • http://www.thesaigontimes.vn/271832/cau-chuyen-tri-tue-nhan-tao.html

Pin:

  • semantic scholar: https://www.semanticscholar.org/
  • grow a mind: http://web.mit.edu/cocosci/Papers/tkgg-science11-reprint.pdf
  • trendingarxiv: http://trendingarxiv.smerity.com/
  • https://github.com/andrewt3000/DL4NLP
  • Natural language inference (NLI): https://github.com/Smerity/keras_snli
  • ACL: http://www.aclweb.org/anthology/P/P16/

Data type: NOQ

  • Nominal (N): cat, dog --> x, o | vis: shape, color
  • Ordinal (O): Jan - Feb - Mar - Apr | vis: area, density
  • Quantitative (Q): numerical 0.42, 0.58 | vis: length, position
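
The NOQ mapping above can be sketched as a small lookup. The channel table is taken from the list; the `classify` helper and its `ordered_levels` parameter are hypothetical names introduced only for this illustration, and the rule it applies (numbers are quantitative, values from a known ordered set are ordinal, everything else is nominal) is a simplifying assumption.

```python
from numbers import Real

# Visual channels suggested for each data type, as listed above.
CHANNELS = {
    "nominal": ("shape", "color"),
    "ordinal": ("area", "density"),
    "quantitative": ("length", "position"),
}


def classify(values, ordered_levels=None):
    """Guess the NOQ type of a column of values (hypothetical helper)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if all(isinstance(v, Real) and not isinstance(v, bool) for v in values):
        return "quantitative"
    if ordered_levels is not None and all(v in ordered_levels for v in values):
        return "ordinal"
    return "nominal"


months = ["Jan", "Feb", "Mar", "Apr"]
kind_pets = classify(["cat", "dog"])
kind_months = classify(["Feb", "Jan"], ordered_levels=months)
kind_scores = classify([0.42, 0.58])
```

`CHANNELS[kind_scores]` then gives the channels to try first for that column.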

People:

  • Graham CMU: http://www.phontron.com/teaching.php, https://github.com/neubig/

Fin data:

  • Reuters 8M (2007-2016): https://github.com/philipperemy/Reuters-full-data-set.git
  • Bloomberg https://github.com/philipperemy/financial-news-dataset
  • stocktwits: https://github.com/goodwillyoga/E107project/tree/master/pooja/data

Projects:

  • https://github.com/THEdavehogue/glassdoor-analysis

Wikidata:

  • https://github.com/VladimirAlexiev/VladimirAlexiev.github.io/blob/master/CH-names/README.org
  • https://github.com/VladimirAlexiev/VladimirAlexiev.github.io/tree/master/CH-names

Cartoons & Quotes:

  • "'Cause you know sometimes words have two meanings" (Led Zeppelin)
  • http://stats.stackexchange.com/questions/423/what-is-your-favorite-data-analysis-cartoon?newsletter=1&nlcode=231076%7C1179

Books:

  • http://neuralnetworksanddeeplearning.com/index.html
  • http://u.cs.biu.ac.il/~yogo/nnlp.pdf

Done:

  1. EMNLP 2016, Austin, 2-4 Nov: http://www.emnlp2016.net/tutorials.html#practical
  • Dynet (CMU): https://t.co/nSCkBt0i0F
  • lifelong ML (Google): http://www.emnlp2016.net/tutorials/chen-liu-t3.pdf
  • Markov logic for scalable joint inference: http://www.emnlp2016.net/tutorials/venugopal-gogate-ng-t2.pdf
  • good summary of sentiment analysis with NN (Singapore): http://www.emnlp2016.net/tutorials/zhang-vo-t4.pdf
  • structure prediction (POS, NER)(Singapore): http://www.emnlp2016.net/tutorials/sun-feng-t6.pdf

  • BADLS: 2-day conference at Stanford University

day 1:

  • Hugo (Twitter): Feed forward NN
  • Karpathy (OpenAI): Convnet
  • Socher (MetaMind): NLP = word2vec/glove + GRU + MemNet
  • Tensorflow tut: from 5:55:49
  • Ruslan: Deep Unsup Learning: from 7:10:39
  • Andrew Ng: Nuts and bolts in applied DL from 9:09:46

day 2:

  • Schulman: RL from 06:40
  • Pascal (MILA): Theano, from 1:52:03
  • ASR from 4:01:11
  • NN with Torch from 5:49:32, https://github.com/alexbw/bayarea-dl-summerschool
  • seq2seq learning, Quoc Le: from 7:03:44
  • Bengio: Foundations and challenges in DL, from 9:01:14

  • data fest: https://alexanderdyakonov.wordpress.com/

  • 8,9,12,13 Sept: data science week: http://dsw2016.datascienceweek.com/

  • KDD 2016: http://www.kdd.org/kdd2016/

  • ACL 2016, Berlin, 7-12 Aug: http://acl2016.org/index.php?article_id=60

AI mistakes:

  • napalm girl: https://techcrunch.com/2016/09/12/facebook-employees-say-deleting-napalm-girl-photo-was-a-mistake/
  • driver fined for his car's shadow: http://www.independent.co.uk/news/world/europe/russian-driver-fined-car-shadow-moscow-a7225146.html
  • human on motorcycle: http://cs.stanford.edu/people/karpathy/deepimagesent/generationdemo/

Keras:

  • image classification with vgg16: http://www.pyimagesearch.com/2016/08/10/imagenet-classification-with-python-and-keras/
  • hualos, keras viz: https://github.com/fchollet/hualos
  • https://github.com/dylandrover/kerastutorial/blob/master/kerastutorial/keras_deck.pdf
  • https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/widendeep_tutorial.py
  • model zoo: https://github.com/tensorflow/models
  • music auto tag: https://github.com/keunwoochoi/music-auto_tagging-keras
  • expose API: https://github.com/samjabrahams/inception-resnet-flask-demo

NLP:

  • https://github.com/attardi/deepnl
  • https://github.com/biplab-iitb/practNLPTools
  • http://ml.nec-labs.com/senna/
  • LSTM + CNN char on NER: https://transacl.org/ojs/index.php/tacl/article/viewFile/792/202
  • https://metamind.io/research/the-wikitext-long-term-dependency-language-modeling-dataset/

Apps:

  • https://github.com/fginter/w2v_demo
  • http://bionlp-www.utu.fi/wv_demo/
  • 3top: https://github.com/3Top/word2vec-api
  • next wave of nn: http://www.nextplatform.com/2016/09/14/next-wave-deep-learning-applications/
  • labeling tools: http://cs.stanford.edu/people/karpathy/ilsvrc/
  • deep art: https://deepart.io/hire/kzXhuUPf/
  • text sum: http://esapi.intellexer.com/Summarizer
  • http://www.deeplearningpatterns.com/doku.php/applications
  • mt: http://104.131.78.120/
  • rnn: http://www.cs.toronto.edu/~ilya/fourth.cgi?prefix=I+have+a+dream.+&numChars=150
  • chatbot: http://sumve.com/firesidechat/
  • text vis: http://slanglab.cs.umass.edu/topic-animator/
  • music auto tag: https://github.com/keunwoochoi/music-auto_tagging-keras
  • deep image sent: http://cs.stanford.edu/people/karpathy/deepimagesent/rankingdemo/

German word embedding:

  • pretrained: http://devmount.github.io/GermanWordEmbeddings/
  • vis: pca, tsne: https://github.com/devmount/GermanWordEmbeddings/blob/master/code/pca.ipynb

PyGotham:

  • textacy: http://michelleful.github.io/code-blog/2016/07/23/nlp-at-pygotham-2016/
  • nlp with keras, rnn, cnn
  • https://github.com/drincruz/PyGotham-2016
  • skipthought: https://libraries.io/github/LeavesBreathe/Sequence-To-Sequence-Generation-Skip-Thoughts-
  • https://github.com/ryankiros/skip-thoughts
  • doc sum: http://mike.place/talks/pygotham/#p1

Journalist LDA and ML:

  • http://knightlab.northwestern.edu/2015/03/10/nicar-2015-machine-learning-lessons-for-journalists/
  • summary on hanna wallach https://docs.google.com/document/d/1kIIzBAF9T9Zu99i0DU9akIajvYZ-CfHeBFVBhIJyEY8/edit?pref=2&pli=1
  • http://www.cs.ubc.ca/~murphyk/MLbook/pml-toc-22may12.pdf
  • http://slides.com/stevenrich/machine-learning#/18
  • https://github.com/cjdd3b/nicar2015/tree/master/machine-learning
  • https://github.com/cjdd3b/fec-standardizer

Europython:

  • http://kjamistan.com/i-hate-you-nlp/
  • https://github.com/adewes/machine-learning-chinese
  • https://github.com/GaelVaroquaux/my_topics
  • https://github.com/arnicas/nlpelasticsearchreviews

Scipy 2016:

  • http://scipy2016.scipy.org/ehome/146062/332963/

Performance Evaluation(PE):

  • book ELA: http://www.cambridge.org/us/academic/subjects/computer-science/pattern-recognition-and-machine-learning/evaluating-learning-algorithms-classification-perspective
  • slides: http://www.icmla-conference.org/icmla11/PE_Tutorial.pdf
  • bayesian hypothesis testing: http://ipg.idsia.ch/preprints/corani2015c.pdf

Hypothesis testing

  • http://bebi103.caltech.edu/2015/tutorials/t6bfrequentisthypothesis_testing.html
  • central limit theorem: http://nbviewer.jupyter.org/github/mbakker7/exploratorycomputingwithpython/blob/master/notebooks3/pyexpcomps3sol.ipynb
  • hypothesis testing and p value: http://vietsciences.free.fr/khaocuu/nguyenvantuan/bieudoR/ch7-kiemdinhgiathiet.htm

Metrics:

  • http://users.dsic.upv.es/~dpinto/duc/RougeLin.pdf

Rock, Metal and NLP:

  • http://www.deepmetal.io/
  • https://github.com/ijmbarr/metal_models
  • http://www.degeneratestate.org/posts/2016/Sep/12/heavy-metal-and-natural-language-processing-part-2/
  • http://www.degeneratestate.org/posts/2016/Apr/20/heavy-metal-and-natural-language-processing-part-1/

Financial:

  • https://github.com/johnymontana/NewzTraderAIproject

Twitter:

  • http://nlp.stanford.edu/projects/glove/preprocess-twitter.rb
  • GATE NER dataset: https://gate.ac.uk/wiki/broad-twitter-corpus.html

Deep Learning Frameworks/Toolkits:

  • Tensorflow
  • Torch
  • Theano
  • Keras
  • Dynet
  • CNTK

ElasticSearch + Kibana:

  • install ES 2.4 + Kibana; the Sense console runs on Kibana's default port 5601
  • http://ghostweather.slides.com/lynncherny/deck

Attention based:

  • code RWA in TF: https://github.com/jostmey/rwa
  • decomposable attention: https://github.com/explosion/spaCy/tree/master/examples/kerasparikhentailment
  • customized lstm with attention: http://benjaminbolte.com/blog/2016/keras-language-modeling.html
  • vis + cnn + lstm: https://blog.heuritech.com/2016/01/20/attention-mechanism/

ResNet: Residual Networks

  • http://yanran.li/peppypapers/2016/01/10/highway-networks-and-deep-residual-networks.html
  • how deep: VGG 16/19 vs ResNet 152/200 layers: https://www.reddit.com/r/MachineLearning/comments/4cmcfs/howcanresnetcnngodeepto152layersand200/
  • http://www.slideshare.net/Textkernel/practical-deep-learning-for-nlp

Sentiment

  • dataset: 1.6M: https://docs.google.com/uc?id=0B04GJPshIjmPRnZManQwWEdTZjg&export=download
  • quandl: https://github.com/kszela24/options-daily
  • stocktwit: http://stocktwits.com/symbol/FINL
  • https://github.com/jssandh2/StockSearchEngine
  • https://www.quantopian.com/posts/crowd-sourced-stock-sentiment-using-stocktwits
  • https://www.crowdflower.com/data-for-everyone/

NER

  • https://github.com/aleju/ner-crf
  • 2017 conference: http://noisy-text.github.io/2017/
  • demo: http://nlp.stanford.edu:8080/ner/process
  • ritter: https://www.cise.ufl.edu/class/cis6930fa11lad/cis6930fa11_NEROverTweets.pdf
  • cmu tweetnlp: http://www.cs.cmu.edu/~ark/TweetNLP/
  • opencalais: http://www.opencalais.com/opencalais-demo/
  • https://www.quora.com/How-can-I-find-city-country-company-name-from-a-tweet-text-using-Java
  • no broad domain, average accuracy 80-85% is quite good: https://www.quora.com/How-accurate-are-entity-extraction-tools
  • http://blog.districtdatalabs.com/named-entity-recognition-and-classification-for-entity-extraction
  • http://noisy-text.github.io/2016/ner-shared-task.html
  • https://noisy-text.github.io/2016/pdf/WNUT26.pdf
  • dataset: https://www.dropbox.com/s/yaoy7zi9vz71nki/wnutnerevaluation.tgz?dl=0
  • wnut solution: https://github.com/napsternxg/TwitterNER
  • dataset wnut16: https://github.com/aritter/twitter_nlp/tree/master/data/annotated/wnut16/data

ML Stacking

  • brew: https://github.com/viisar/brew
  • heamy: https://github.com/rushter/heamy

Tensorflow tutorials

  • https://github.com/alrojo/tensorflow-tutorial
  • https://github.com/farizrahman4u/keras-contrib

Covariate shift

  • https://www.quora.com/What-is-Covariate-shift
  • https://blog.bigml.com/2013/11/01/machine-learning-next/
  • https://blog.bigml.com/2013/03/12/machine-learning-from-streaming-data-two-problems-two-solutions-two-concerns-and-two-lessons/

PydataLondon2017

  • https://pydata.org/london2017/schedule/presentation/12/
  • https://pydata.org/london2017/schedule/presentation/20/
  • https://pydata.org/london2017/schedule/presentation/34/
  • https://pydata.org/london2017/schedule/presentation/17/
  • https://pydata.org/london2017/schedule/presentation/47/
  • https://pydata.org/london2017/schedule/presentation/16/
  • https://pydata.org/london2017/schedule/presentation/52/
  • https://pydata.org/london2017/schedule/presentation/22/
  • https://pydata.org/london2017/schedule/presentation/30/
  • https://pydata.org/london2017/schedule/presentation/23/
  • https://pydata.org/london2017/schedule/presentation/69/

NLP course

  • https://www.cs.bgu.ac.il/~elhadad/nlp17.html

Dataset

  • CONLL2003: https://github.com/kuruonur1/char-tag
  • https://files.pushshift.io/reddit/comments/

Tricks of DL

  • https://engineering.purdue.edu/~qobi/papers/ad2016d.pdf
  • practical DL: http://www.deeplearningbook.org/slides/11_practical.pdf
  • tuning cnn: http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html
  • https://github.com/Conchylicultor/Deep-Learning-Tricks
  • https://cs224d.stanford.edu/lectures/CS224d-Lecture6.pdf
  • http://karpathy.github.io/neuralnets/

Pointer network

  • http://fastml.com/introduction-to-pointer-networks/
  • keras: https://github.com/zygmuntz/pointer-networks-experiments
  • https://arxiv.org/pdf/1511.06391v4.pdf
  • https://www.slideshare.net/KeonKim/attention-mechanisms-with-tensorflow

Attention

  • https://arxiv.org/abs/1707.00110

Log likelihood test

  • tool: http://ucrel.lancs.ac.uk/llwizard.html
  • significance testing of word frequency in corpora: https://users.ics.aalto.fi/lijffijt/articles/lijffijt2015a.pdf
  • text analysis with topic models for the humanities and social sciences: https://de.dariah.eu/tatom/
  • http://sappingattention.blogspot.com/2011/10/comparing-corpuses-by-word-use.html#comments
  • http://sappingattention.blogspot.com/2011/11/dunning-amok.html
  • https://tedunderwood.com/2011/11/09/identifying-the-terms-that-characterize-an-author-or-genre-why-dunnings-may-not-be-the-best-method/
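
The links above discuss comparing word frequencies between two corpora with Dunning's log-likelihood. A minimal sketch of the two-corpus statistic in its common form: expected counts come from pooling both corpora, and G2 = 2 * sum(observed * ln(observed / expected)). The function and parameter names are illustrative assumptions, and the counts in the usage lines are made up for the example.

```python
import math


def log_likelihood_g2(freq1, size1, freq2, size2):
    """Log-likelihood (G2) for one word observed in two corpora.

    freq1, freq2: occurrences of the word in corpus 1 and corpus 2.
    size1, size2: total number of tokens in each corpus.
    """
    total = freq1 + freq2
    expected1 = size1 * total / (size1 + size2)
    expected2 = size2 * total / (size1 + size2)
    g2 = 0.0
    for observed, expected in ((freq1, expected1), (freq2, expected2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


same_rate = log_likelihood_g2(10, 1000, 20, 2000)
different_rate = log_likelihood_g2(20, 1000, 10, 1000)
```

A value above 3.84 corresponds to p < 0.05 for a chi-squared distribution with one degree of freedom. The linked posts explain why this threshold should be read with care for word counts.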

MLtrainings.ru

  • quora presentation: https://gh.mltrainings.ru/presentations/SkornyakovKaggleQuora2017.pdf
  • hearthstone: https://gh.mltrainings.ru/presentations/PatekhaHearthstone2017.pdf

GCloud

  • http://www.albertauyeung.com/post/setup-jupyter-nginx-supervisor/
  • https://medium.com/google-cloud/running-jupyter-notebooks-on-gpu-on-google-cloud-d44f57d22dbd

Current conference

  • http://sigir.org/sigir2017/
  • icml: https://2017.icml.cc/
  • emnlp: http://emnlp2017.net/

https://github.com/aymericdamien/TensorFlow-Examples

Timeline

  • kaggle in Russian: https://boosters.pro/champs
  • https://github.com/mariazm/Spring2017ProfFosterProvost/tree/master/Module8Unsupervised_MLreview
  • https://github.com/johnpateha/mlhacks/blob/master/djexplore_algoparameters.ipynb

WSDM 2019

  • https://sites.google.com/view/wsdm19-fairness-tutorial
  • https://causalinference.gitlab.io/wsdm-tutorial/
  • https://sites.google.com/view/wsdm19-privacy-tutorial
  • https://www.slideshare.net/TetsuyaSakai/wsdm2019tutorial
  • https://arxiv.org/abs/1808.05163
  • https://www.google.com/maps/@-37.8067424,144.9921405,13z/data=!3m1!4b1!4m3!11m2!2sn0Jgpeo5HjLhS61R5hCfiUgIaOhuHQ!3e3

Computer Vision

  • http://slazebni.cs.illinois.edu/spring17/
  • https://skymind.ai/wiki/convolutional-network
  • https://medium.com/@jonathan_hui/what-do-we-learn-from-single-shot-object-detectors-ssd-yolo-fpn-focal-loss-3888677c5f4d
  • https://towardsdatascience.com/faster-r-cnn-object-detection-implemented-by-keras-for-custom-data-from-googles-open-images-125f62b9141a
  • https://medium.com/@jonathan_hui/design-choices-lessons-learned-and-trends-for-object-detections-4f48b59ec5ff
  • https://github.com/akTwelve/tutorials/blob/master/maskrcnn/MaskRCNNTrainAndInference.ipynb
  • https://github.com/RockyXu66/FasterRCNNforOpenImagesDatasetKeras/blob/master/frcnntrainvgg.ipynb
  • https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/
  • https://www.superdatascience.com/blogs/the-ultimate-guide-to-convolutional-neural-networks-cnn
  • https://yerevann.com/a-guide-to-deep-learning/
  • https://towardsdatascience.com/facial-keypoint-detection-detect-relevant-features-of-face-in-a-go-using-cnn-your-own-dataset-e09cf359c2bc

ICCV 2019

07.10

  • https://stackoverflow.com/questions/42307949/color-theme-for-vs-code-integrated-terminal/46166487
  • https://github.com/zhulingchen/tfp-tutorial
  • tf2 keras for researcher: https://colab.research.google.com/drive/1UCJt8EYjlzCs1H1d1X0iDGYJsHKwu-NO
  • visualizing outliers in big data: https://www.cs.uic.edu/~wilkinson/Publications/outliers.pdf

13.06

  • https://github.com/tmbdev/ocropy
  • https://github.com/keras-team/keras/blob/master/examples/image_ocr.py

04.06

  • https://storage.googleapis.com/openimages/web/challenge.html

18.05

  • https://www.slideshare.net/HITCONGIRLS/ithome-2019-ai-turkeymelodypdf-138370023

17.05

  • https://handong1587.github.io/deep_learning/2015/10/09/object-detection.html
  • https://github.com/experiencor/keras-yolo3
  • https://github.com/Adamdad/keras-YOLOv3-mobilenet
  • https://arxiv.org/pdf/1805.02283.pdf
  • https://github.com/seasonSH/DocFace/tree/master/src
  • https://habr.com/ru/company/avito/blog/452142/
  • https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d

14.05

  • https://medium.com/@zhanwenchen/install-cuda-10-1-and-cudnn-7-5-0-for-pytorch-on-ubuntu-18-04-lts-9b6124c44cc
  • https://stackoverflow.com/questions/43214346/split-queue-into-train-test-set

13.05

  • https://www.sites.google.com/site/yorkyuhuang/home/tutorial/deep-learning-1/objectdetectiontrackingrecognition-with-deep-learning
  • deepsystems ctc loss: https://www.youtube.com/watch?v=eYIL4TMAeRI
  • https://github.com/linkedin/TonY/blob/master/tony-examples/tony-in-gcp/scripts/installgpucu10.sh

08.05

  • https://github.com/cydonia999/TinyFacesin_Tensorflow/blob/master/README.md

07.05

  • https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5
  • https://github.com/Hyperparticle/one-pixel-attack-keras
  • https://github.com/sozykin/dlpythoncourse/blob/master/computervision/fotocomparison/fotoverification.ipynb
  • http://mostafadehghani.com/2019/05/05/universal-transformers/
  • https://blogs.dropbox.com/tech/2017/04/creating-a-modern-ocr-pipeline-using-computer-vision-and-deep-learning/

03.05

  • tf.dataset https://www.youtube.com/watch?v=kVEOCfBy9uY&feature=youtu.be
  • https://github.com/SkalskiP/ILearnDeepLearning.py/blob/master/01mysteriesofneuralnetworks/06numpyconvolutionalneuralnetINPROGRESS/Building%20convolutional%20neural%20network%20in%20Numpy.ipynb

28.04

  • https://assessingpsyche.wordpress.com/2014/06/04/using-the-truncated-normal-distribution/

24.04

  • https://medium.com/@philosophygeek/selling-data-products-is-the-wrong-business-model-for-ai-startups-300835c4eb92
  • https://simplystatistics.org/2019/04/17/tukey-design-thinking-and-better-questions/
  • https://www.youtube.com/watch?v=qFtJaq4TlqE&feature=youtu.be
  • https://medium.com/nybles/create-your-first-image-recognition-classifier-using-cnn-keras-and-tensorflow-backend-6eaab98d14dd

19.04

  • https://github.com/jingw222/tf2-serving-w-docker/blob/master/servingwdocker.ipynb
  • https://medium.com/@jingw222/tensorflow-serving-with-docker-an-end-to-end-example-24b412e31ae1
  • https://bitbucket.org/pbcquoc/ocr/src/64e6eb1d0e63?at=master
  • https://speakerdeck.com/alexkimxyz/monitoring-ml-applications-in-production

10.04

  • http://d2l.ai/chapter_introduction/intro.html#A-Motivating-Example
  • pytorch tuts 10K+: https://github.com/yunjey/pytorch-tutorial
  • tf v2 tuts: https://github.com/aymericdamien/TensorFlow-Examples/tree/master/tensorflow_v2
  • catboost lecture: https://compscicenter.ru/media/courses/2018-spring/spb-machine-learning-2/slides/machinelearning2lecture260218.pdf
  • https://habr.com/ru/post/447376/

09.04

  • https://towardsdatascience.com/which-deep-learning-framework-is-growing-fastest-3f77f14aa318
  • https://threader.app/thread/1105139360226140160

08.04

  • https://hurenjun.github.io/
  • beam search: https://www.coursera.org/lecture/nlp-sequence-models/beam-search-4EtHZ
  • joint embedding for transportation: https://hurenjun.github.io/pubs/aaai2019-slides.pdf
  • embedding for anomaly detection: https://hurenjun.github.io/pubs/icde2016-slides.pdf

05.04

  • https://towardsdatascience.com/3-ways-to-load-csv-files-into-colab-7c14fcbdcb92
  • https://github.com/kaiwaehner/python-jupyter-apache-kafka-ksql-tensorflow-keras
  • https://www.kaggle.com/mlg-ulb/creditcardfraud/kernels
  • https://jakevdp.github.io/blog/2015/07/23/learning-seattles-work-habits-from-bicycle-counts/

03.04

  • https://slides.com/vladimiriglovikov/title-texttitle-text-17#/0/25
  • http://gameaibook.org/book.pdf
  • https://services.google.com/fh/files/blogs/insightsforevaluatinglifetimevalueforgame_developers.pdf

01.04

  • https://berkeley-deep-learning.github.io/cs294-131-s19/
  • https://www.technologyreview.com/s/613170/emtech-digital-dawn-song-adversarial-machine-learning/

31.03

  • https://blog.ml.cmu.edu/2019/03/29/building-machine-learning-models-via-comparisons/
  • https://pmbaumgartner.github.io/notebooks/colored-roc-curves/
  • http://dsd.future-lab.cn/members/2015nlp/readings/rWISAIsHallofFame.pdf
  • http://dsd.future-lab.cn/members/2015nlp/nature482.pdf

30.03

  • https://www.usenix.org/conference/enigma2017/conference-program/presentation/evans
  • http://web.stanford.edu/class/cs224n/index.html#coursework
  • https://towardsdatascience.com/time-series-nested-cross-validation-76adba623eb9
  • https://www.datasciencecentral.com/profiles/blogs/fee-book-applied-stochastic-processes

29.03

  • https://github.com/yenchenlin/awesome-adversarial-machine-learning

28.03

  • https://www.analyticsvidhya.com/blog/2018/07/introductory-guide-maximum-likelihood-estimation-case-study-r/
  • https://scholarspace.manoa.hawaii.edu/bitstream/10125/50002/1/paper0115.pdf
  • https://medium.freecodecamp.org/keras-vs-pytorch-avp-transfer-learning-c8b852c31f02

21.03

  • https://www.nguyenvantuan.info/research-blog/the-blind-faith-in-the-p-values-should-be-stopped

20.03

  • https://drive.google.com/file/d/1idTS63oXT1jBUNm_qH9fke0VDGihM7ir/view
  • gan https://github.com/Dyakonov/DL/blob/master/AMDDL09gan17.pdf

14.03

  • https://www.math3ma.com/blog/matrices-probability-graphs
  • https://explained.ai/rf-importance/index.html
  • http://julian.togelius.com/Drachen2013Game.pdf

11.03

  • https://www.youtube.com/watch?v=s3VmuVPfu0s
  • https://www.youtube.com/watch?v=1cRGpDXTJC8&t=638s

07.03

  • https://medium.com/tensorflow/recap-of-the-2019-tensorflow-dev-summit-1b5ede42da8d
  • http://gltr.io/dist/index.html
  • https://www.amazon.com/Analytics-Descriptive-Predictive-Network-Techniques/dp/1119133122

06.03

  • https://www.youtube.com/channel/UCZqlZbg9EzwRnLqhFQumQ/featured?app=desktop
  • https://www.slideshare.net/albedan/kaggle-days-paris-alberto-danese-ml-interpretability
  • xgboost from 0: https://www.youtube.com/watch?v=0hxX4XAf2DA
  • KDD 2016 RecSys, field-aware CTR: https://www.youtube.com/watch?v=1cRGpDXTJC8

01.03

  • https://github.com/lexfridman/mit-deep-learning
  • https://en.wikipedia.org/wiki/Nofreelunch_theorem
  • https://en.wikipedia.org/wiki/Sunk_cost
  • https://en.wikipedia.org/wiki/Reinforcement_learning
  • https://dyakonov.org/2019/02/21/%D0%BD%D0%B5%D0%BC%D0%B0%D1%82%D0%B5%D0%BC%D0%B0%D1%82%D0%B8%D0%BA%D0%B0-%D0%B2-%D0%B0%D0%BD%D0%B0%D0%BB%D0%B8%D0%B7%D0%B5-%D0%B4%D0%B0%D0%BD%D0%BD%D1%8B%D1%85/

21.02

  • https://boosters.pro/championships
  • machine learns physical laws: https://arxiv.org/abs/1807.10300
  • https://istina.msu.ru/media/publications/article/972/9eb/7537819/sw-factors-dyakonov.pdf

20.02

  • https://github.com/Microsoft/Recommenders
  • https://blog.openai.com/better-language-models/#content

19.02

  • http://deliprao.com/archives/314
  • https://console.cloud.google.com/storage/browser/commonsense-reasoning/reproduce/stories_corpus?pli=1

13.02

  • https://github.com/omarsar/nlphighlights/blob/master/NLP2018_Highlights.pdf
  • https://hbr.org/2019/02/companies-are-failing-in-their-efforts-to-become-data-driven
  • https://www.nytimes.com/2019/02/05/business/media/artificial-intelligence-journalism-robots.html

12.02

  • https://github.com/google/sentencepiece
  • https://www.youtube.com/watch?v=0EtD5ybnh_s
  • https://aws.agorize.com/en/challenges/vietnam-2019
  • https://nlp.stanford.edu/seminar/details/jdevlin.pdf
  • https://www.lyrn.ai/2019/02/11/xlm-cross-lingual-language-model/

11.02

  • https://github.com/noveens/svae_cf
  • https://www.kaggle.com/artgor/how-to-not-overfit

09.02

  • https://github.com/DiligentPanda/TencentAdsAlgo_2018
  • https://github.com/Dyakonov/mlhacks/blob/master/djMLDM_kernels.ipynb

03.02

  • https://stats.stackexchange.com/questions/312780/why-is-accuracy-not-the-best-measure-for-assessing-classification-models
  • http://iranarze.ir/wp-content/uploads/2016/10/E2281.pdf

24.01

  • https://causalinference.gitlab.io/kdd-tutorial/

21.01

  • http://www.econ.upf.edu/~michael/stanford/maeb6.pdf
  • http://www.econ.upf.edu/~michael/stanford/maeb4.pdf
  • http://www.econ.upf.edu/~michael/stanford/maeb5.pdf

18.01

  • kids learn and acquire language using statistical learning. Chomsky school. https://www.youtube.com/watch?v=uSFPgDuyv6E
  • bootstrap with pitfalls: https://arxiv.org/pdf/1411.5279.pdf
  • categorical data analysis: https://www.youtube.com/watch?v=FCrYGuO8CmU
  • humbio: https://www.ted.com/talks/robertsapolskythebiologyofourbestandworst_selves?language=en

16.01

  • https://machinelearningforkids.co.uk/
  • https://www.quantamagazine.org/been-kim-is-building-a-translator-for-artificial-intelligence-20190110
  • https://ai.googleblog.com/2019/01/looking-back-at-googles-research.html
  • https://hbr.org/2019/01/data-science-and-the-art-of-persuasion

14.01

  • https://www.datasciencecentral.com/m/blogpost?id=6448529:BlogPost:791619
  • https://www.datasciencecentral.com/profiles/blogs/how-to-choose-fraud-detection-software-features-characteristics
  • https://learnk8s.io/blog/scaling-machine-learning-with-kubeflow-tensorflow
  • https://yanirseroussi.com/2019/01/08/hackers-beware-bootstrap-sampling-may-be-harmful/
  • https://medium.com/acing-ai/capital-one-data-science-interview-questions-b6263d8a3af6
  • https://github.com/FunctorML/BellkorAlgorithm
  • https://blogs.mathworks.com/loren/2015/04/22/the-netflix-prize-and-production-machine-learning-systems-an-insider-look/

03.01

  • https://github.com/Erlemar/digit-draw-recognize
  • https://medium.com/analytics-and-data/on-customer-lifetime-value-in-ecommerce-d3c151c6fdc0
  • http://blog.kaggle.com/2017/01/23/a-kaggle-master-explains-gradient-boosting/
  • http://mattturck.com/bigdata2018

02.01

  • startup genome: https://s3.amazonaws.com/startupcompass-public/StartupGenomeReport2WhyStartupsFailv2.pdf
  • https://www.amazon.com/gp/product/0470650931
  • https://peadarcoyle.com/2019/01/01/think-you-need-to-learn-bayesian-analysis-read-this-first/
  • https://inst.eecs.berkeley.edu/~cs188/fa18/
  • https://www.jmp.com/en_us/academic/data-mining-techniques.html
  • birthday effect: https://dyakonov.org/2016/11/28/%D0%B4%D0%B5%D0%BD%D1%8C-%D0%BD%D0%B0%D1%88%D0%B5%D0%B9-%D1%81%D0%BC%D0%B5%D1%80%D1%82%D0%B8/

===== GOODBYE 2018

29.12

  • https://preferred.ai/category/education/
  • http://www.ousia.jp/en/page/en/2017/02/20/wsdm-cup/
  • https://medium.com/@bryan.gregory1/predicting-customer-churn-extreme-gradient-boosting-with-temporal-data-332c0d9f32bf

25.12

  • https://github.com/mwburke/population-stability-index/blob/master/walkthrough-example.ipynb

22.12

  • https://qiita.com/namakemono/items/f9574fe0a6b7ebb91e73
  • https://github.com/ShuaiW/kaggle-classification/
  • http://www.chioka.in/kaggle-competition-solutions/
  • https://github.com/Far0n/kaggletils
  • https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/
  • https://github.com/rushter/heamy
  • https://scikit-learn.org/stable/autoexamples/ensemble/plotfeature_transformation.html
  • https://research.fb.com/publications/practical-lessons-from-predicting-clicks-on-ads-at-facebook/
  • https://github.com/iamtodor/data-science-interview-questions-and-answers

20.12

  • https://www.dgsiegel.net/talks/the-bullet-hole-misconception
  • https://www.analyticsvidhya.com/blog/2016/09/40-interview-questions-asked-at-startups-in-machine-learning-data-science/
  • https://github.com/jessevig/bertviz

19.12

  • https://static.googleusercontent.com/media/research.google.com/en//bigpicture/MLVisualizationNeurIPS_Tutorial.pdf
  • https://www.facebook.com/nipsfoundation/videos/203530960558001/
  • https://www.slideshare.net/BryanGregory2/kaggle-wsdm-2018-winning-solution-predicting-customer-churn-xgboost-with-temporal-data-87662268

18.12

  • http://www.ruiyan.me/pubs/tutorial-emnlp18.pdf
  • http://newsletter.ruder.io/issues/neurips-2018-the-nature-of-research-advances-in-image-generation-protein-folding-and-rl-144756

17.12

  • https://github.com/facebookresearch/pytext
  • nlu: https://purl.stanford.edu/gd576xb1833

12.12

  • https://towardsdatascience.com/in-browser-object-detection-using-yolo-and-tensorflow-js-d2a2b7429f7c
  • https://github.com/SkalskiP/ILearnMachineLearning.py

10.12

  • https://github.com/tensorflow/models/tree/master/research/cvt_text
  • https://lilianweng.github.io/lil-log/
  • https://github.com/zalandoresearch/flair

09.12

  • https://hai.stanford.edu/news/theintertwinedquestforunderstandingbiologicalintelligenceandcreatingartificialintelligence/
  • https://medium.com/@kcimc/how-to-recognize-fake-ai-generated-images-4d1f6f9a2842
  • https://machinethoughts.wordpress.com/2017/09/01/deep-meaning-beyond-thought-vectors/
  • https://arxiv.org/pdf/1809.04559.pdf

07.12

  • https://www.stateoftheart.ai/?area=Computer%20Vision
  • https://adversarial-ml-tutorial.org/
  • https://www.kaggle.com/kernels.json?sortBy=hotness&group=everyone&pageSize=200

06.12

  • https://github.com/tensorflow/ranking

04.12

  • https://machinelearning.apple.com/2018/12/03/optimizing-siri-on-homepod-in-far-field-settings.html
  • https://seeing-theory.brown.edu/basic-probability/index.html
  • https://jalammar.github.io/illustrated-bert/

02.12

  • https://medium.com/@kt.era.ee/the-data-science-workflow-43859db0415
  • https://colab.research.google.com/drive/1lEu7qNBMSIm2g7YfBhgug7IAB6Rw4b5E

01.12

  • https://github.com/zhpmatrix/zhpmatrix.github.io/blob/master/cellar/DiveintoXGBoost.pdf

29.11

  • http://visualcommonsense.com/#anexample
  • https://supernlp.github.io/2018/11/26/sentreps/

26.11

  • transform net for target sentiment analysis: https://ai.tencent.com/ailab/media/publications/acl/TransformationNetworksforTarget-OrientedSentiment_Classification.pdf
  • https://lixin4ever.github.io/paper/ACL2018/slides/acl18lixinslides.pdf

BERT with <3

  • https://github.com/facebookresearch/XNLI
  • https://hanxiao.github.io/2018/06/24/4-Encoding-Blocks-You-Need-to-Know-Besides-LSTM-RNN-in-Tensorflow/
  • https://github.com/google-research/bert#pre-trained-models
  • https://github.com/hanxiao/bert-as-service#q-what-is-the-parallel-processing-model-behind-the-scene

20.11

  • https://ai.tencent.com/ailab/TransformationNetworksforTarget-OrientedSentiment_Classification.html
  • https://github.com/alicezheng/feature-engineering-book

15.11

  • https://medium.com/analytics-vidhya/python-libraries-for-data-science-other-than-pandas-and-numpy-95da30568fad
  • https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CIKM14tutorialHeGaoDeng.pdf

14.11

  • Vietnamese NER: https://github.com/duongna21/VNsequencelabeling
  • pzad data preprocessing: https://github.com/Dyakonov/PZAD/blob/master/PZAD201809datapreprocessing_15.pdf
  • https://medium.com/acing-ai/what-is-hidden-in-the-hidden-markov-models-eee7bab45ac3

13.11

  • http://ruder.io/optimizing-gradient-descent/
  • don't decay lr, increase the batch size: https://arxiv.org/abs/1711.00489

12.11

  • https://www.youtube.com/watch?v=uvH1zB7qahI
  • https://www.youtube.com/watch?v=6n-kCYn0zxU
  • https://github.com/Featuretools/predicting-customer-churn/blob/master/churn/4.%20Feature%20Engineering%20on%20Spark.ipynb
  • https://demo.ipavlov.ai/
  • https://towardsdatascience.com/how-to-create-value-with-machine-learning-eb09585b332e
  • https://github.com/Featuretools/predicting-customer-churn
  • https://arxiv.org/pdf/1810.09591.pdf

10.11

  • deep learning in airbnb search: https://arxiv.org/pdf/1810.09591.pdf
  • https://www.youtube.com/watch?v=FmKU-1LZGoE

08.11

  • https://github.com/hse-aml
  • http://web.stanford.edu/class/cs20si/lectures/slides_13.pdf
  • https://www.kaggle.com/sudalairajkumar/a-look-at-different-embeddings
  • https://github.com/pjankiewicz/mercari-solution/blob/master/mercari/transformers.py

07.11

  • https://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
  • https://towardsdatascience.com/my-weaknesses-as-a-data-scientist-1310dab9f566
  • https://jhu-advdatasci.github.io/2018/lectures/12-being-skeptical.html
  • https://chauff.github.io/2018-11-04-emnlp/

06.11

  • https://github.com/Hvass-Labs/TensorFlow-Tutorials

04.11

  • http://seq2seq-vis.io/
  • https://www.tensorflow.org/tutorials/

01.11

  • http://yowconference.com.au/slides/yowdata2017/Hougland-SparkMLWorkflows.pdf
  • https://github.com/rjurney/AgileDataCode_2

29.10

  • https://towardsdatascience.com/understanding-feature-engineering-part-1-continuous-numeric-data-da4e47099a7b

25.10

  • https://ingoscholtes.github.io/kdd2018-tutorial/

23.10

  • https://www.kdnuggets.com/2018/05/deep-learning-apache-spark-part-2.html/2
  • https://www.kaggle.com/c/pzadbabki/discussion
  • https://www.kaggle.com/c/pubg-finish-placement-prediction

18.10

  • https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf
  • https://github.com/hse-aml/natural-language-processing

16.10

  • https://gh.mltrainings.ru/presentations/KulaginDbrain2018.pdf
  • https://gh.mltrainings.ru/presentations/KuzinDLCompetitionsStory2018.pdf
  • https://gh.mltrainings.ru/presentations/KayumovDSCompetitions2018.pdf

10.10

  • elmo at apple: https://machinelearning.apple.com/2018/09/27/can-global-semantic-context-improve-neural-language-models.html
  • https://github.com/MorvanZhou/Tensorflow-Tutorial
  • expose blackbox: https://github.com/tsterbak/pydata2018-amsterdam/blob/master/presentation.ipynb
  • elmo with keras: https://github.com/UKPLab/elmo-bilstm-cnn-crf

09.10

  • https://dvc.org/features
  • https://www.facebook.com/pytorch/videos/169366590639145/

08.10

  • https://github.com/YuyangZhangFTD/awesome-RecSys-papers
  • https://github.com/MorvanZhou
  • https://github.com/SkalskiP/ILearnDeepLearning.py
  • https://ml.informatik.uni-freiburg.de/papers/18-AUTOML-AutoChallenge.pdf

03.10

  • https://www.onlinemathtraining.com/wp-content/uploads/2016/04/Math-for-Machine-Learning-Book-Preview.pdf
  • http://leananalyticsbook.com/wp-content/uploads/2013/01/Analytics-Lessons-Learned.pdf
  • https://blogs.rstudio.com/tensorflow/posts/2018-09-26-embeddings-recommender/

02.10

  • Cross-View Training (CVT) better than ELMo? https://arxiv.org/abs/1809.08370
  • https://machinelearning.apple.com/2018/09/27/can-global-semantic-context-improve-neural-language-models.html

29.09

  • http://www.fast.ai/2018/09/26/ml-launch/
  • https://github.com/parrt/animl

27.09

  • https://databricks.com/blog/2015/06/02/statistical-and-mathematical-functions-with-dataframes-in-spark.html
  • https://blogs.rstudio.com/tensorflow/posts/2018-09-26-embeddings-recommender/
  • https://medium.com/feature-labs-engineering/featuretools-on-spark-e5aa67eaf807

26.09

  • https://christophm.github.io/interpretable-ml-book/index.html
  • https://github.com/roamanalytics/roamresearch/blob/master/BlogPosts/Categoricalvariablesintreemodels/categoricalvariablespost.ipynb
  • https://goku.me/blog/EHR?utmcampaign=DataElixir

25.09

  • http://datajournalismhandbook.org/1.0/en/index.html
  • https://opinionator.blogs.nytimes.com/2010/04/25/chances-are/
  • https://www2.cs.duke.edu/courses/spring15/compsci216/lectures/04-stats.pdf

24.09

  • ranksums check correlated features: https://www.kaggle.com/aantonova/797-lgbm-and-bayesian-optimization
  • https://www.inovex.de/fileadmin/files/Vortraege/2018/bridging-the-gap-from-data-science-to-production-europython2018-wilhelm.pdf
  • https://github.com/Santosh-Gupta/Research2Vec
  • https://towardsdatascience.com/elmo-embeddings-in-keras-with-tensorflow-hub-7eb6f0145440
  • https://sites.google.com/a/ucsc.edu/krumholz/teaching-and-courses/ast119_w15/class-10
  • http://hanj.cs.illinois.edu/cs412/bk3/08.pdf
  • https://roamanalytics.com/2016/10/28/are-categorical-variables-getting-lost-in-your-random-forests/

21.09

  • https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45189.pdf
  • https://towardsdatascience.com/elmo-embeddings-in-keras-with-tensorflow-hub-7eb6f0145440
  • https://github.com/deepmipt/DeepPavlov

20.09

  • http://hanj.cs.illinois.edu/cs412/bk3/08.pdf
  • https://algorithms-tour.stitchfix.com/#data-platform
  • https://githubengineering.com/towards-natural-language-semantic-code-search/
  • https://www.svds.com/pivoting-data-in-sparksql/
  • https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
  • https://kagglerank.azurewebsites.net/
  • http://www.pkbigdata.com/common/cmpt/2018%E7%A7%91%E5%A4%A7%E8%AE%AF%E9%A3%9EAI%E8%90%A5%E9%94%80%E7%AE%97%E6%B3%95%E5%A4%A7%E8%B5%9B_%E8%B5%9B%E4%BD%93%E4%B8%8E%E6%95%B0%E6%8D%AE.html

19.09

  • https://towardsdatascience.com/elmo-embeddings-in-keras-with-tensorflow-hub-7eb6f0145440

18.09

  • https://www.kaggle.com/ogrellier/feature-selection-with-null-importances
  • https://www.kaggle.com/aantonova/797-lgbm-and-bayesian-optimization

16.09

  • https://onnx.ai/
  • https://medium.com/@srnghn/the-mathematics-of-decision-trees-random-forest-and-feature-importance-in-scikit-learn-and-spark-f2861df67e3
  • https://roamanalytics.com/2016/10/28/are-categorical-variables-getting-lost-in-your-random-forests/
  • https://gist.github.com/rnowling/fa6f1007e3547c75f8b2

13.09

  • https://towardsdatascience.com/simtext-2nd-solution-for-cikm-analyticup-2018-b3347e026e67
  • http://blog.madhukaraphatak.com/spark-vector-to-numpy/
  • https://github.com/zziz/pwc

11.09

  • http://www.ams.org/notices/199502/golubitsky.pdf
  • https://nbviewer.jupyter.org/github/lazarusA/CodeSnippets/blob/master/CodeSnippetsPython/SymmetricChaos.ipynb

08.09

  • hyperbolic RS: https://arxiv.org/pdf/1809.01703.pdf

07.09

  • https://medium.com/@matsutton/repurchase-rate-the-most-overlooked-ecommerce-kpi-337bccde184b

04.09

  • life metaphor: noise vs signal: https://www.johndcook.com/blog/2013/10/28/remove-noise-remove-signal/
  • erf and normal cdf: https://www.johndcook.com/erfandnormal_cdf.pdf
  • https://medium.com/data-science-school/practical-apache-spark-in-10-minutes-part-6-graphx-9cc953afa487
  • best paper kdd: https://medium.com/syncedreview/kdd-2018-announces-best-paper-other-awards-4835ab8475a4
  • https://habr.com/company/eastbanctech/blog/422093/
  • https://github.com/GINK03/kaggle-dae
  • https://www.business-school.ed.ac.uk/crc/wp-content/uploads/sites/55/2017/02/Credit-Scoring-and-the-Optimization-Concerning-Area-Under-the-Curve-Anne-Kraus-and-Helmut-K%C3%BCchenhoff.pdf

28.08

  • kdd wrap up: https://habr.com/company/mailru/blog/421041/
  • bayesian reasoning: https://github.com/bayesgroup/deepbayes-2018/blob/master/day1_bayesian-reasoning/presentation.pdf

27.08

  • normality test with kurtosis: http://www.columbia.edu/~ld208/psymeth97.pdf
  • botanical prime: https://www.c82.net/work/?id=352
  • https://glowingpython.blogspot.com/2017/04/solving-two-spirals-problem-with-keras.html
  • http://bit.ly/beautifulObjectJupyterCon
  • https://docs.google.com/presentation/d/1n2RlMdmv1p25Xy5thJUhkKGvjtV-dkAIsUXP-AL4ffI/edit#slide=id.g362da5805701

23.08

  • https://sites.google.com/view/kdd2018-tutorial/home/slides
  • knowledge distillation: https://www.youtube.com/watch?v=lSjBc1wSJMI
  • https://docs.google.com/presentation/d/17hylV84mAY6-0uhYxFcI59nsiYoev4nTESUvoFlrA/edit#slide=id.g389fd03f420112
  • https://hackernoon.com/towards-ai-how-long-does-it-take-you-to-go-from-idea-to-working-prototype-a-day-a-month-8a03ffecca0a
  • http://rsos.royalsocietypublishing.org/content/5/5/171274

22.08

  • stats and sport: https://statsbylopez.com/276labs/
  • cs229: https://stanford.edu/~shervine/teaching/cs-229.html

21.08

  • ncsoft blade & soul churn prediction: https://arxiv.org/pdf/1802.02301.pdf
  • bayesian intro: https://www.datascience.com/blog/introduction-to-bayesian-inference-learn-data-science-tutorials

20.08

  • churn data science game: https://arxiv.org/pdf/1802.02301.pdf
  • https://speakerdeck.com/teoliphant/ml-in-python?slide=46
  • Murphy's law, anything that can go wrong will go wrong: https://en.wikipedia.org/wiki/Murphy%27s_law
  • https://alexanderdyakonov.wordpress.com/2018/07/30/%D0%B1%D0%B0%D0%B9%D0%B5%D1%81%D0%BE%D0%B2%D1%81%D0%BA%D0%B8%D0%B9-%D0%BF%D0%BE%D0%B4%D1%85%D0%BE%D0%B4/
  • https://github.com/springcoil/PyDataLondonTutorial/blob/master/notebooks/LogisticRegScikitlearn.ipynb

18.08

  • http://brohrer.github.io/howbayesianinference_works.html
  • https://docs.google.com/presentation/d/1325yenZP_VdHoVj-tU0AnbQUxFwb8Fl1VdyAAUxEzfg/edit#slide=id.p

17.08

  • https://github.com/ipython-books/cookbook-2nd
  • http://tuvalu.santafe.edu/~simon/br.pdf

16.08

  • 3 schools of data: http://slides.com/springcoil/
  • https://github.com/springcoil/PyDataLondonTutorial/blob/master/notebooks/Statistics.ipynb
  • https://github.com/godatadriven/os-training-materials

15.08

  • TrueSkill 2: https://www.microsoft.com/en-us/research/uploads/prod/2018/03/trueskill2.pdf
  • https://blog.ycombinator.com/learning-math-for-machine-learning/
  • https://github.com/tensorflow/model-analysis
  • https://anvaka.github.io/greview/hands-on-ml/1/

14.08

  • large to small better than small to large: http://koaning.io/variable-selection-in-machine-learning.html
  • bayesian is good: https://blog.datank.ai/how-i-learned-to-stop-worrying-and-love-uncertainty-fd13c23442b6
  • think bayesian: http://www.greenteapress.com/thinkbayes/thinkbayes.pdf
  • bayesian for hackers: https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers

13.08

  • http://cbl.eng.cam.ac.uk/pub/Intranet/MLG/ReadingGroup/BronskillinferNET.pdf
  • Tim is good: http://timvieira.github.io/
  • https://imaddabbura.github.io/blog/machine%20learning/data%20science/2018/03/15/predicting-loan-repayment.html
  • gumbel max trick: https://arxiv.org/abs/1611.01144
  • love uncertainty: https://github.com/arinarmo/love_uncertainty/blob/master/slides.pdf
  • Vincent talk: https://www.youtube.com/watch?v=dE5j6NW-Kzg

10.08

  • http://datagenetics.com/blog/february52018/index.html
  • https://eng.uber.com/cota/
  • https://www.youtube.com/watch?v=Q2HLPCBStLQ

08.08

  • https://drive.google.com/file/d/14zSllcWPgsARqpF7D6haSF7M2PFsZY/view
  • https://drive.google.com/file/d/13e2bBwpncMshaMyUKkUno0T-K-Ott_/view
  • https://pycon.sg/news/slides/

07.08

  • https://towardsdatascience.com/7-recommendations-for-data-science-leaders-in-the-game-industry-3d82d45746d2
  • http://koaning.io/theme/notebooks/deep-ai-stupid.pdf
  • https://github.com/louridas/rwa/blob/master/content/notebooks/chapter_01.ipynb

06.08

  • http://cs230.stanford.edu/syllabus.html#midterm
  • https://www.youtube.com/watch?v=7CcSm0PAr-Y
  • http://cs230.stanford.edu/files/Deep%20Learning%20in%20Healthcare.pdf

03.08

  • https://github.com/GoogleCloudPlatform/tensorflow-without-a-phd
  • https://blog.ycombinator.com/learning-math-for-machine-learning/
  • https://chrisyeh96.github.io/2017/08/08/definitive-guide-python-imports.html
  • https://www.analyticsvidhya.com/blog/2018/07/infographic-common-mistakes-amateur-data-scientists-make-how-avoid-them

01.08

  • https://cdn-images-1.medium.com/max/1400/1*xbNM_CnEIWQtGbsLmZtE-A.gif
  • https://github.com/jhfjhfj1/autokeras

30.07

  • https://medium.com/activewizards-machine-learning-company/comparison-of-top-6-python-nlp-libraries-c4ce160237eb
  • https://github.com/natasha/ipyannotate

27.07

  • https://github.com/lancifollia/tinygbt/blob/master/tinygbt.py
  • http://cs231n.stanford.edu/slides/2017/cs231n2017lecture15.pdf
  • https://mapr.com/blog/churn-prediction-pyspark-using-mllib-and-ml-packages/
  • https://github.com/ChuckWoodraska/EurekaTrees
  • https://github.com/Microsoft/LightGBM/blob/master/examples/python-guide/plot_example.py

26.07

  • http://people.stat.sc.edu/haigang/improvement.html
  • https://docs.databricks.com/spark/latest/mllib/binary-classification-mllib-pipelines.html
  • https://www.coursera.org/lecture/machine-learning-applications-big-data/spark-ml-cross-validation-O0uKs

24.07

  • https://www.linkedin.com/pulse/beginners-ask-how-many-hidden-layersneurons-use-artificial-ahmed-gad/
  • https://towardsdatascience.com/july-edition-text-understanding-adaaff0bbd63
  • https://david-abel.github.io/blog/posts/misc/icml_2018.pdf

20.07

  • https://towardsdatascience.com/how-to-build-a-data-science-portfolio-5f566517c79c
  • https://david-abel.github.io/blog/posts/misc/icml_2018.pdf
  • https://drive.google.com/file/d/1Mw6JZ9k0e8ajfiQ8uI-VP2my96DJINr4/view
  • http://ecsocman.hse.ru/data/2012/06/06/1271384006/5.pdf

17.07

  • https://eng.lyft.com/from-shallow-to-deep-learning-in-fraud-9dafcbcef743
  • http://ecsocman.hse.ru/data/2012/06/06/1271384006/5.pdf

15.07

  • https://esc.fnwi.uva.nl/thesis/centraal/files/f244841390.pdf
  • http://deliprao.com/archives/294
  • https://pdfs.semanticscholar.org/584d/1e09f8e3fa359fcd2b9931bfc71d4672de3a.pdf
  • https://www.ijcai.org/proceedings/2017/0504.pdf
  • https://github.com/jeremyjordan/imbalanced-data/blob/master/Learning%20from%20imbalanced%20data.ipynb
  • https://www.jeremyjordan.me/imbalanced-data/
  • https://www.jeremyjordan.me/nn-learning-rate/

14.07

  • https://github.com/mathurinm/celer
  • https://www.slideshare.net/agramfort/icml-2018-reproducible-machine-learning-a-gramfort

11.07

  • https://rise.cs.berkeley.edu/blog/pandas-on-ray-early-lessons/
  • https://alexanderdyakonov.wordpress.com/2018/06/28/%D0%BF%D1%80%D0%BE%D1%81%D1%82%D1%8B%D0%B5-%D0%BC%D0%B5%D1%82%D0%BE%D0%B4%D1%8B-%D0%B0%D0%BD%D0%B0%D0%BB%D0%B8%D0%B7%D0%B0-%D0%B4%D0%B0%D0%BD%D0%BD%D1%8B%D1%85/

10.07

  • https://arxiv.org/pdf/1802.05365.pdf
  • https://thegradient.pub/nlp-imagenet/
  • http://www.fast.ai/2018/07/02/adam-weight-decay/
  • https://github.com/deepmipt/DeepPavlov

05.07

  • http://eric.univ-lyon2.fr/~ricco/cours/slides/PJ%20-%20en%20-%20machine%20learning%20avec%20scikit-learn.pdf

04.07

  • https://frnsys.com/ai_notes/
  • https://hackernoon.com/why-businesses-fail-at-machine-learning-fbff41c4d5db

29.06

  • https://deepsense.ai/keras-or-pytorch/
  • https://www.ijcai.org/proceedings/2017/0504.pdf
  • https://www.jeremyjordan.me/nn-learning-rate/

28.06

  • https://getstream.io/blog/factorization-recommendation-systems/
  • https://www.slideshare.net/stairlab/higherorder-factorization-machines5
  • https://www.cs.waikato.ac.nz/~fbravoma/deepnlptut.pdf

26.06

  • https://medium.com/datreeio/training-with-keras-mxnet-on-amazon-sagemaker-43a34bd668ca
  • https://medium.com/@richardchen_81235/custom-keras-model-in-sagemaker-277a2831ac67
  • https://github.com/awslabs/amazon-sagemaker-examples

25.06

  • https://www.predictiveanalyticsworld.com/patimes/wise-practitioner-predictive-analytics-interview-series-tauseef-rahman-at-mercer/9538/
  • https://arxiv.org/pdf/1704.04565.pdf
  • http://ruder.io/tracking-progress-nlp/
  • https://www.kdnuggets.com/2015/03/interview-josh-hemann-activision-big-data.html
  • https://www.kdnuggets.com/2015/03/interview-josh-hemann-activision-data-science.html

22.06

  • https://nlp.stanford.edu/pubs/hancock2018babble.pdf
  • https://tomaugspurger.github.io/modern-1-intro.html
  • https://einstein.ai/static/images/pages/research/decaNLP/decaNLP.pdf
  • https://www.ibm.com/developerworks/community/blogs/jfp/entry/ImplementingLibfmin_Keras?lang=en

21.06

  • https://einstein.ai/static/images/pages/research/decaNLP/decaNLP.pdf

20.06

  • horovod is coool: https://medium.com/searchink-eng/keras-horovod-distributed-deep-learning-on-steroids-94666e16673d
  • https://medium.com/product-at-catalant-technologies/using-lightfm-to-recommend-projects-to-consultants-44084df7321c
  • https://databricks.com/blog/2016/05/19/approximate-algorithms-in-apache-spark-hyperloglog-and-quantiles.html

19.06

  • multi gpus: https://datascience.stackexchange.com/questions/23895/multi-gpu-in-keras
  • https://keras.io/getting-started/faq/#how-can-i-run-a-keras-model-on-multiple-gpus
  • https://stackoverflow.com/questions/50096/how-to-pass-password-to-scp
  • https://stackoverflow.com/questions/31326015/how-to-verify-cudnn-installation
  • https://keras.io/utils/#multigpumodel
  • https://arxiv.org/pdf/1710.02262.pdf
  • http://scikit-learn.org/stable/autoexamples/preprocessing/plotall_scaling.html#sphx-glr-auto-examples-preprocessing-plot-all-scaling-py
  • https://www.pyimagesearch.com/2017/10/30/how-to-multi-gpu-training-with-keras-python-and-deep-learning/

18.06

  • https://github.com/mephistopheies/mlworkshop39042017/blob/master/3masterclass/ipy/feature_extraction.ipynb
  • https://pdfs.semanticscholar.org/fc72/59942d3d9d9f0d45565853755e74a983e028.pdf

15.06

  • toxic in Russian: https://www.youtube.com/watch?v=aMlpeDOjib8
  • multitask learning: https://arxiv.org/pdf/1806.03713.pdf

14.06

  • http://www.gamedonia.com/blog/5-ways-to-calculate-lifetime-value-for-free-to-play-games
  • https://towardsdatascience.com/how-to-build-a-dynamic-garden-using-machine-learning-d589468f7c04
  • https://scholarspace.manoa.hawaii.edu/bitstream/10125/50002/1/paper0115.pdf

12.06

  • trieutrinh, google brain: https://github.com/tensorflow/models/tree/master/research/lm_commonsense
  • finetune transformer: https://github.com/openai/finetune-transformer-lm
  • https://blog.openai.com/language-unsupervised/

11.06

  • https://www.poly-ai.com/docs/naacl18.pdf
  • https://petewarden.com/2018/06/11/why-the-future-of-machine-learning-is-tiny/
  • https://threadreaderapp.com/
  • don't forget to check: https://gist.github.com/ttscoff/cded212ec4dd457186ca

09.06

  • http://dylan-chen.com/model/lightgbm-tutorial/

08.06

  • job taxonomy: https://www.youtube.com/watch?v=SWjIoRNTCdU
  • https://www.blog.google/topics/ai/ai-principles/
  • https://github.com/cvanweelden/sequencelabelingexample/blob/master/sequencelabelingexample.ipynb
  • textkernel: https://www.youtube.com/watch?v=xUxjW308CcI
  • https://github.com/mattilyra/LSH/blob/master/examples/Introduction.ipynb
  • https://www.youtube.com/watch?v=n3dCcwWV4_k&index=40&list=PLGVZCDnMOq0ovNxfxOqYcBcQOIny9Zvb-

07.06

  • http://forums.fast.ai/t/30-best-practices/12344/12
  • https://bgweber.github.io/
  • https://github.com/jacobeisenstein/gt-nlp-class/
  • https://towardsdatascience.com/statistics-for-people-in-a-hurry-a9613c0ed0b

06.06

  • https://github.com/datascienceinc/oreilly-intro-to-predictive-clv/blob/master/oreilly-an-intro-to-predictive-clv-tutorial.ipynb
  • http://brucehardie.com/notes/004/bgnbdspreadsheetnote.pdf
  • https://mattilyra.github.io/2017/05/23/document-deduplication-with-lsh.html
  • http://nbviewer.jupyter.org/github/mattilyra/LSH/blob/master/examples/Introduction.ipynb
  • wsdm 2018 papers: http://www.wsdm-conference.org/2018/accepted-papers.html
  • http://brucehardie.com/notes/
  • https://community.firstmarkcap.com/content/clv-in-e-commerce-2013-10-23

05.06

  • https://www.tensorflow.org/hub/modules/google/universal-sentence-encoder/1
  • okcupid, basic stats: https://ww2.amstat.org/publications/jse/v23n2/kim.pdf

04.06

  • https://www.oreilly.com/learning/introduction-to-okrs
  • http://adigaskell.org/2015/06/15/reputation-the-sharing-economy-and-the-market-for-lemons/

02.06

  • RL: https://towardsdatascience.com/introduction-to-various-reinforcement-learning-algorithms-i-q-learning-sarsa-dqn-ddpg-72a5e0cb6287
  • https://towardsdatascience.com/introduction-to-various-reinforcement-learning-algorithms-part-ii-trpo-ppo-87f2c5919bb9
  • https://github.com/JannesKlaas/sometimesdeepsometimes_learning/blob/master/reinforcement.ipynb
  • why no mosaic plot in seaborn: https://www.perceptualedge.com/articles/visualbusinessintelligence/aremosaicplots_worthwhile.pdf
  • https://alexanderdyakonov.wordpress.com/2017/10/30/%D0%B2%D0%B8%D0%B7%D1%83%D0%B0%D0%BB%D0%B8%D0%B7%D0%B0%D1%86%D0%B8%D1%8F-%D1%87%D0%B0%D1%81%D1%82%D1%8C-1/
  • http://karpathy.github.io/2016/05/31/rl/

01.06

  • https://arxiv.org/pdf/1803.11175.pdf
  • https://databricks.com/blog/2017/10/19/introducing-natural-language-processing-library-apache-spark.html
  • https://petewarden.com/2018/05/28/why-you-need-to-improve-your-training-data-and-how-to-do-it/

29.05

  • https://elitedatascience.com/python-seaborn-tutorial

28.05

  • https://github.com/dennybritz/nn-from-scratch/blob/master/nn-from-scratch.ipynb
  • https://www.kdnuggets.com/2016/08/include-high-cardinality-attributes-predictive-model.html
  • https://www.forbes.com/sites/naomirobbins/2012/01/19/when-should-i-use-logarithmic-scales-in-my-charts-and-graphs/3/
  • https://github.com/ianozsvald/datasciencedelivered/blob/master/mlcreatingcorrectcapableclassifiers.ipynb
  • https://github.com/RobRomijnders/weight_uncertainty

26.05

  • https://chrisalbon.com/#deep_learning
  • https://github.com/IBMDecisionOptimization/tutorials/blob/master/jupyter/MachineLearningandCPLEX.ipynb

25.05

  • https://www.slideshare.net/SessionsEvents/misha-bilenko-principal-researcher-microsoft
  • https://medium.com/moonshot/how-to-install-faiss-c986fe474a8f
  • https://medium.com/ibm-data-science-experience/optimizing-a-marketing-campaign-moving-from-predictions-to-actions-e39b8ab1f865
  • https://www.lunametrics.com/blog/2016/06/30/marketing-channel-attribution-markov-models-r/
  • https://github.com/IBMDecisionOptimization/tutorials/blob/master/jupyter/MachineLearningandCPLEX.ipynb
  • https://pocketphilosopher.net/2016/01/27/using-machine-learning-to-tune-a-game/
  • http://davidmlane.com/hyperstat/chi_square.html
  • https://medium.com/@erushton214/a-simple-spell-checker-built-from-word-vectors-9f28452b6f26

24.05

  • http://davidmlane.com/hyperstat/viswanathan/appreciation.html
  • https://storage.googleapis.com/pub-tools-public-publication-data/pdf/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf
  • http://davidmlane.com/hyperstat/viswanathan/chisquaremarketing.html
  • https://www.statisticssolutions.com/chi-square-2/
  • Marketing – Are women more likely than men to buy a product online?
  • https://medium.com/@inlinecoder/disrupting-the-entrance-point-to-a-predictive-data-analytics-12676aa91a8d
  • https://github.com/alessiamarcolini/deep-learning_best-practices
  • https://github.com/reshamas/fastaideeplearnpart1

23.05

  • https://support.appsflyer.com/hc/en-us/articles/115002667326-Best-Practices-for-Detection-of-Mobile-Fraud
  • https://github.com/SeitaroShinagawa/FavoritePapers/blob/master/nlp.md

22.05

  • https://developers.google.com/machine-learning/rules-of-ml/
  • https://www.datasciencecentral.com/profiles/blogs/why-logistic-regression-should-be-the-last-thing-you-learn-when-b
  • http://vita.had.co.nz/papers/engineering-da.pdf
  • http://vita.had.co.nz/presentations.html

21.05

  • http://vita.had.co.nz/presentations.html
  • http://vita.had.co.nz/papers/tidy-data.pdf

18.05

  • https://peerj.com/collections/50-practicaldatascistats/
  • https://medium.com/indeed-data-science/theres-no-such-thing-as-a-data-scientist-8dae923c14e3
  • https://medium.com/indeed-data-science/marketing-for-data-science-a-7-step-go-to-market-plan-for-your-next-data-product-60c034c34d55
  • https://blog.ouseful.info/2016/09/13/making-music-and-embedding-sounds-in-jupyter-notebooks/
  • https://xcitech.github.io/tutorials/travelers/
  • https://github.com/jfpuget/LibFMinKeras/blob/master/keras_blog.ipynb

17.05

  • sentence piece, sub word: https://github.com/google/sentencepiece
  • fastai nlp with transfer learning: http://forums.fast.ai/t/part-2-lesson-10-wiki/14364
  • https://xcitech.github.io/tutorials/heroku_tutorial/
  • lime: https://homes.cs.washington.edu/~marcotcr/blog/lime/
  • http://nlp.fast.ai/
  • https://medium.com/activewizards-machine-learning-company/top-7-data-science-use-cases-in-finance-303c05a3cb58
  • https://www.oreilly.com/learning/introduction-to-local-interpretable-model-agnostic-explanations-lime

15.05

  • https://www.youtube.com/watch?v=mmLukrKMSnw

14.05

  • https://www.slideshare.net/Nordeus/early-churn-prediction-and-personalised-interventions-in-top-eleven-game
  • https://medium.com/googleplaydev/five-tips-to-improve-your-games-as-a-service-monetization-1a99cccdf21
  • http://www.cs.cmu.edu/~./dpelleg/download/yachurn.pdf

13.05

  • https://www.slideshare.net/TakanoriHayashi3/talkingdata-adtracking-fraud-detection-challenge-1st-place-solution
  • https://github.com/jfpuget/LibFMinKeras/blob/master/keras_blog.ipynb
  • https://github.com/vdutor/tf-rex

10.05

  • https://events.prace-ri.eu/event/686/material/slides/0.pdf
  • https://medium.com/@mrpowers/working-with-dates-and-times-in-spark-491a9747a1d2
  • https://github.com/MSusik/newgradientboosting/blob/master/pydata.pdf

09.05

  • https://cilab.sejong.ac.kr/gdmc2017/index.php/tutorial/
  • ds politics: https://www.rdisorder.eu/2017/09/13/most-difficult-thing-data-science-politics/
  • https://towardsdatascience.com/how-to-survive-corporate-politics-as-a-data-scientist-ba914fac2471
  • http://businessforecastblog.com/whats-the-lift-of-your-churn-model-predictive-analytics-and-big-data/
  • http://blog.datalifebalance.com/lift-charts-a-data-scientists-secret-weapon/

08.05

  • employee attrition: https://www.youtube.com/watch?v=pviTahK6KuQ
  • https://s3.amazonaws.com/assets.datacamp.com/blogassets/PySparkSQLCheatSheet_Python.pdf

07.05

  • https://statsbot.co/blog/calculating-customer-lifetime-value-sql-example/
  • https://databricks.com/blog/2015/06/02/statistical-and-mathematical-functions-with-dataframes-in-spark.html
  • https://github.com/datatalesblog/Feature-Engineering-in-PySpark/blob/master/Value%20Investing%20PySpark%20Code.py
  • https://gist.github.com/anish749/6a815ed281f538068a0d3a20ca9044fa

02.05

  • make nnet uncool again: http://www.fast.ai/2018/04/29/categorical-embeddings/
  • https://pdfs.semanticscholar.org/8004/cd728305c9abb203cc09885c64fcc5e45f43.pdf

01.05

  • http://ianozsvald.com/
  • https://github.com/marcotcr/lime/blob/master/doc/notebooks/Tutorial%20-%20continuous%20and%20categorical%20features.ipynb
  • https://github.com/ianozsvald/datasciencedelivered/blob/master/mlexplainregression_prediction.ipynb

30.04

  • plot decision plane: https://github.com/arogozhnikov/MLatImperial2017/blob/master/utils.py
  • http://contest.ai-academy.ru/hackathon
  • https://github.com/alxmamaev/Dota2Competition/blob/master/Solution.ipynb
  • https://github.com/ikatsov/algorithmic-examples/blob/master/promotions/MarkovLTV.ipynb

29.04

  • https://www.slideshare.net/PyData/random-forests-best-practices-for-the-business-world
  • https://speakerd.s3.amazonaws.com/presentations/45e7e9769a17481c9957300105c45041/PyDataLondon2018FullFact.pdf

28.04

  • http://gael-varoquaux.info/interpretingmltuto/content/interpretingrandomforests.html#meaning-and-caveats
  • https://thuijskens.github.io/2017/10/07/feature-selection/
  • https://hackernoon.com/a-guide-to-scaling-machine-learning-models-in-production-aa8831163846
  • https://github.com/arogozhnikov/arogozhnikov.github.io/blob/master/notebooks/2015-09-29-NumpyTipsAndTricks1.ipynb

24.04

  • https://www.slideshare.net/amr_qura/neural-network-based-player-retention-prediction-in-free-to-play-games

23.04

  • https://towardsdatascience.com/the-fall-of-rnn-lstm-2d1594c74ce0
  • https://compscicenter.ru/media/slides/machinelearning22018spring/20180226machinelearning22018_spring.pdf
  • https://towardsdatascience.com/exploring-the-census-income-dataset-using-bubble-plot-cfa1b366313b
  • https://medium.com/scribd-data-science-engineering/multi-armed-bandits-for-the-win-240b71bc3464
  • https://github.com/ChenglongChen/tensorflow-XNN/blob/master/doc/MercariPriceSuggesionCompetitionChenglongChen4thPlace.pdf

20.04

  • https://medium.com/open-machine-learning-course/open-machine-learning-course-topic-2-visual-data-analysis-in-python-846b989675cd
  • http://www.statisticshowto.com/probability-and-statistics/skewed-distribution/
  • https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/lecture-slides-and-files/index.htm
  • https://reshamas.github.io/to-kaggle-or-not/
  • http://learningsys.org/nips17/assets/papers/paper_11.pdf
  • chisquare: http://uregina.ca/~gingrich/ch10.pdf

19.04

  • https://medium.com/@keeper6928/how-to-unit-test-machine-learning-code-57cf6fd81765

18.04

  • http://marcotcr.github.io/lime/tutorials/Tutorial%20-%20continuous%20and%20categorical%20features.html
  • mean roc auc: http://scikit-learn.org/stable/autoexamples/modelselection/plotroccrossval.html
  • https://www.kdnuggets.com/2018/04/7-books-mathematical-foundations-data-science.html
  • https://www.appsflyer.com/blog/4-new-ways-use-boost-performance-audiences/

15.04

  • crosswire device matching/grouping https://www.youtube.com/watch?v=nfEDGY2siU8
  • https://www.apptamin.com/blog/lifetime-value-mobile-customer/
  • https://lloydmelnick.com/2013/01/08/ltv-the-lifeblood-of-your-business/
  • https://data36.com/wp-content/uploads/2016/08/practicaldatadictionaryfinaldata36tomimesterpublished.pdf

10.04

  • https://stats.stackexchange.com/questions/18844/when-and-why-should-you-take-the-log-of-a-distribution-of-numbers
  • https://www.coursera.org/learn/vvedenie-mashinnoe-obuchenie
  • https://github.com/mortido/mlbootcamponlinegame/blob/master/itog.py
  • http://www.machinelearning.ru/wiki/images/4/4f/Voron-ML-Modeling-slides.pdf
  • http://www.machinelearning.ru/wiki/images/archive/9/97/20140227072517!Voron-ML-Logic-slides.pdf
  • https://habrahabr.ru/post/324590/

09.04

  • https://www.digitaldoughnut.com/articles/2017/january/how-to-use-customer-lifetime-value-in-your-plan
  • https://github.com/mstephenmsmith/predictiveLTVanalysis
  • https://github.com/fastai/fastai/blob/master/courses/ml1/lesson3-rf_foundations.ipynb
  • https://alexanderdyakonov.wordpress.com/2017/10/30/%D0%B2%D0%B8%D0%B7%D1%83%D0%B0%D0%BB%D0%B8%D0%B7%D0%B0%D1%86%D0%B8%D1%8F-%D1%87%D0%B0%D1%81%D1%82%D1%8C-1/

06.04

  • https://github.com/harvardnlp/annotated-transformer/blob/master/The%20Annotated%20Transformer.ipynb
  • https://github.com/YixuanLi/LEMON

05.04

  • https://people.cs.umass.edu/~jpjiang/cs646/03evalbasics.pdf
  • https://towardsdatascience.com/facebook-research-just-published-an-awesome-paper-on-learning-hierarchical-representations-34e3d829ede7
  • https://www.saama.com/blog/poincare-embeddings-for-representing-hierarchical-data/
  • https://rare-technologies.com/implementing-poincare-embeddings/
  • https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/Poincare%20Tutorial.ipynb

04.04

  • https://academy.microsoft.com/en-us/professional-program/tracks/artificial-intelligence/

02.04

  • https://mlbootcamp.ru/news_list/
  • https://github.com/catboost/catboost/blob/master/catboost/tutorials/advancedtutorials/catboostcoremlexporttutorial.ipynb
  • https://developer.apple.com/documentation/coreml/integratingacoremlmodelintoyour_app

01.04

  • https://funmatu.wordpress.com/2017/11/02/hyperopt/

churn:

  • https://www.mapd.com/blog/VW-Predicts-Churn-with-GPU-Accelerated-Machine-Learning-and-visual-analytics
  • https://activewizards.com/blog/top-9-data-science-use-cases-in-banking/
  • https://indico.cern.ch/event/617754/contributions/2590694/attachments/1459648/2254154/catboostforCMS.pdf

repeat purchase:

  • https://www.youtube.com/watch?v=kOqLbibOGus
  • https://github.com/PengInGitHub/Repeat-Buyer-Prediction-for-E-Commerce/blob/master/solution.pdf
  • https://www.slideshare.net/moa108/repeat-buyer-prediction-for-e-commerce-kdd2016

31.03

  • https://hackernoon.com/what-leading-artificial-intelligence-course-should-you-take-and-what-should-you-do-after-261a933bb3da
  • https://www.tensorflow.org/dev-summit/

30.03

  • https://zenodo.org/record/166035#.Wr4nGtNubOQ
  • http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0180735
  • http://textvis.lnu.se/
  • https://github.com/xvoland/Extract/blob/master/extract.sh
  • https://github.com/DmitryUlyanov/Multicore-TSNE/blob/master/MulticoreTSNE/examples/test.py

28.03

  • https://ahmedbesbes.com/how-to-mine-newsfeed-data-and-extract-interactive-insights-in-python.html
  • https://data36.com/reporting-optimizing-predicting-data/
  • https://data36.com/ab-testing-5-rules/
  • https://github.com/stared/livelossplot/blob/master/keras_example.ipynb

27.03

  • https://gist.github.com/jiffyclub/905bf5e8bf17ec59ab8f#file-hdftoparquet-py
  • https://nbviewer.jupyter.org/github/JasonKessler/Scattertext-PyData/blob/master/PyData-Scattertext-Part-1.ipynb

26.03

  • http://web.stanford.edu/class/ee380/Abstracts/141112-slides.pdf
  • https://data36.com/predictive-analytics-101-part-1/
  • https://data36.com/ab-testing-5-rules/
  • https://data36.com/fake-door-testing/

24.03

  • https://www.stat.berkeley.edu/~stark/Java/Html/index.htm
  • https://data36.com/statistical-bias-types-explained/

23.03

  • https://hackernoon.com/aspiring-data-scientists-start-to-learn-statistics-with-these-6-books-a33bbb55b8e9
  • https://towardsdatascience.com/catboost-vs-light-gbm-vs-xgboost-5f93620723db
  • https://github.com/Featuretools/predictnextpurchase/blob/master/Tutorial.ipynb

22.03

  • https://tonysyu.github.io/raw_content/matplotlib-style-gallery/gallery.html
  • https://github.com/Featuretools/predictnextpurchase/
  • https://github.com/josolnik/behavioral-learnings-projects/

21.03

  • https://biendata.com/competition/kdd_2018/
  • https://soundcloud.com/piskvorky/rrp-4-leo-boytsov-on-approximate-search-and-information-retrieval
  • https://mailchi.mp/radimrehurek/radims-machine-learning-newsletter-3263153
  • https://www.featuretools.com/

20.03

  • https://www.slideshare.net/ShangxuanZhang/kaggle-winning-solution-xgboost-algorithm-let-us-learn-from-its-author
  • https://docs.google.com/spreadsheets/d/1gnxAnJPpEz6hTukVPMNHDAjb0yCTEgzyo9NUiQjfc/edit#gid=1036524761

19.03

  • https://github.com/mm-mansour/Fast-Pandas
  • https://pdfs.semanticscholar.org/0203/6a9565159f19633c5de023321cdf422f43d3.pdf
  • https://medium.com/@joshelman/the-only-metric-that-matters-ab24a585b5ea

18.03

  • https://www.urbanairship.com/blog/churn-prediction-our-machine-learning-model
  • mobile app events with ML: https://pdfs.semanticscholar.org/0203/6a9565159f19633c5de023321cdf422f43d3.pdf
  • http://josolnik.com/simulatingproductusage_data.html
  • http://www.graphviz.org/pdf/dotguide.pdf

16.03

  • https://medium.com/@pushkarmandot/https-medium-com-pushkarmandot-what-is-lightgbm-how-to-implement-it-how-to-fine-tune-the-parameters-60347819b7fc
  • https://docs.google.com/spreadsheets/d/1gnxAnJPpEz6hTukVPMNHDAjb0yCTEgzyo9NUiQjfc/edit#gid=1036524761
  • http://www.graphviz.org/pdf/dotguide.pdf

12.03

  • https://s3.amazonaws.com/assets.datacamp.com/production/course3374/slides/ch3slides.pdf
  • https://www2.unil.ch/biomapper/Download/Lobo-GloEcoBioGeo-2007.pdf
  • https://github.com/Volodymyrk/stats-testing-in-python/blob/master/04%20-%20AB%20testing%20revenues.ipynb
  • https://github.com/anvaka/word2vec-graph
  • http://blog.minitab.com/blog/adventures-in-statistics-2/understanding-t-tests%3A-1-sample%2C-2-sample%2C-and-paired-t-tests

08.03

  • https://distill.pub/2018/building-blocks/
  • https://www.kaggle.com/anokas/talkingdata-adtracking-eda
  • https://github.com/PavelOstyakov/toxic/blob/master/fitpredict.py
  • https://github.com/MLWave/Kaggle-Ensemble-Guide/blob/master/src/kagglerankavg.py
  • https://mlwave.com/kaggle-ensembling-guide/
  • https://www2.unil.ch/biomapper/Download/Lobo-GloEcoBioGeo-2007.pdf

07.03

  • https://www.kdnuggets.com/2017/06/kmeans-clustering-tableau-call-detail-records.html
  • https://github.com/neptune-ml/kaggle-toxic-starter

05.03

  • http://konukoii.com/blog/2018/02/19/twitter-sentiment-analysis-using-combined-lstm-cnn-models/
  • https://www.kaggle.com/ogrellier/lgbm-with-words-and-chars-n-gram/code

04.03

  • https://github.com/mxbi/mlcrate/blob/master/mlcrate/ensemble.py
  • datacamp.com/community
  • https://nbviewer.jupyter.org/github/repmax/topic-model/blob/master/topic-modelling.ipynb

01.03

  • https://nlp.stanford.edu/pubs/sidaw12simplesentiment.pdf
  • http://www.abigailsee.com/2018/02/21/deep-learning-structure-and-innate-priors.html
  • http://cikm2017.org/download/analytiCup/session3/CIKMAnalytiCup2017LazadaProductTitleQualityT2.pdf
  • http://www.businessinsider.com/app-users-are-quick-to-uninstall-2016-11
  • https://jjallaire.shinyapps.io/keras-customer-churn/#section-customer-scorecard
  • https://github.com/rstudio/keras-customer-churn
  • https://github.com/nzw0301/spooky/blob/master/features/bigramsupervisedfasttext.ipynb
  • https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-IMDB.ipynb

28.02

  • https://github.com/slundberg/shap

27.02

  • text classification with fastai: https://www.youtube.com/watch?v=37sFIak42Sc&feature=youtu.be&t=3745
  • https://nlp.stanford.edu/pubs/sidaw12simplesentiment.pdf
  • https://www.kaggle.com/jhoward/nb-svm-strong-linear-baseline
  • https://github.com/rstudio/keras-customer-churn

26.02

  • https://github.com/deepmipt/DeepPavlov
  • are orange cars really not lemons? http://cdn2.hubspot.net/hubfs/2176909/Resources/WhitepaperAreOrangeCarsReallynotLemons.pdf?submissionGuid=347eadf2-9cdb-48a1-8edc-7de4698c3d28
  • http://python-3-patterns-idioms-test.readthedocs.io/en/latest/PythonForProgrammers.html

21.02

  • http://www.cmap.polytechnique.fr/~lepennec/enseignement/DSSPOrange/
  • doing data science: frontline
  • https://github.com/cstorm125/thai2vec/blob/master/notebooks/textclassification.ipynb
  • http://hamelg.blogspot.com/2015/11/python-for-data-analysis-part-24.html

20.02

  • https://medium.mybridge.co/machine-learning-top-10-open-source-projects-v-feb-2018-d1d39062bd20
  • https://tableplus.io/
  • https://mailchi.mp/radimrehurek/radims-machine-learning-newsletter-1544193

13.02

  • https://gist.github.com/iskandr/a874e4cf358697037d14a17020304535

09.02

  • https://github.com/maciejkula/mixture
  • https://towardsdatascience.com/neural-network-architectures-156e5bad51ba
  • https://medium.com/@aldamiz/how-we-grew-from-0-to-4-million-women-on-our-fashion-app-with-a-vertical-machine-learning-approach-f8b7fc0a89d7
  • https://cs224d.stanford.edu/reports/pascal.pdf
  • https://pinboard.in/u:aldamiz/t:re-read/
  • https://medium.com/human-in-a-machine-world/mae-and-rmse-which-metric-is-better-e60ac3bde13d
  • employee quit prediction: https://dzone.com/articles/employee-turnover-prediction-with-deep-learning

07.02

  • https://github.com/the-deep-learners/TensorFlow-LiveLessons/
  • https://github.com/jfloff/pywFM
  • Uber experiment design: https://www.youtube.com/watch?v=9bl7SPSqbX0

06.02

  • https://www.getrevue.co/profile/wildml/issues/the-wild-week-in-ai-andrew-ng-s-new-ai-fund-mini-alphago-implementation-bias-variance-in-rl-and-more-94390
  • https://www.technologyreview.com/s/610095/more-efficient-machine-learning-could-upend-the-ai-paradigm/?utmsource=twitter.com&utmmedium=social&utmcontent=2018-02-05&utmcampaign=Technology+Review
  • http://christophjanz.blogspot.com/2012/05/know-your-user-cohorts.html

05.02

  • https://wsdm-cup-2018.kkbox.events/
  • https://medium.com/@nokkk/jupyter-notebook-tricks-for-data-science-that-enhance-your-efficiency-95f98d3adee4
  • https://wsdm-cup-2018.kkbox.events/pdf/5BGregoryWSDM2018_PredictingCustomerChurn.pdf
  • https://brage.bibsys.no/xmlui/bitstream/handle/11250/2433761/16128_FULLTEXT.pdf

02.02

  • https://github.com/Kaggle/kaggle-api
  • https://github.com/maciejkula/tripletrecommendationskeras
  • https://www.coursera.org/learn/nlp-sequence-models
  • https://arxiv.org/pdf/1703.01365.pdf

01.02

  • https://www.economist.com/news/leaders/21728617-life-age-facial-recognition-what-machines-can-tell-your-face
  • https://blog.openai.com/requests-for-research-2/

31.01

  • https://github.com/hiranumn/IntegratedGradients
  • https://github.com/valentina-s/Novice2DataNinja/blob/master/Videos.ipynb
  • https://www.pyimagesearch.com/2018/01/29/scalable-keras-deep-learning-rest-api/
  • https://github.com/maciejkula/tripletrecommendationskeras

30.01

  • https://medium.com/@ialuronico/my-take-on-pydata-seattle-2017-e8c7b0fa6bf5
  • https://github.com/JonathanRaiman/wikipedia_ner
  • https://github.com/nadiinchi/hsecsmlcourse2017FTAD/blob/master/materials/presentationvis_features.pdf
  • https://github.com/summer1227/appsflyer/blob/master/click_install.py
  • http://www.kdd.org/kdd2016/papers/files/adf0755-vanderveldAbr.pdf
  • https://www.youtube.com/watch?v=UmP3UePGO7E

29.01

  • https://www.kaggle.com/sudalairajkumar/simple-exploration-notebook-zillow-prize
  • https://www.kaggle.com/nikunjm88/creating-additional-features
  • https://github.com/nadiinchi/hsecsmlcourse2017FTAD/blob/master/materials/presentationvis_features.pdf
  • https://blog.metaflow.fr/tensorflow-how-to-optimise-your-input-pipeline-with-queues-and-multi-threading-e7c3874157e0

26.01

  • https://github.com/Microsoft/AutonomousDrivingCookbook/tree/master/AirSimE2EDeepLearning
  • https://blog.insightdatascience.com/how-to-solve-90-of-nlp-problems-a-step-by-step-guide-fda605278e4e

25.01

  • https://syncedreview.com/2017/06/06/re%C2%B7work-deep-learning-in-retail-summit-london-uk/
  • https://www.re-work.co/events/deep-learning-in-retail-summit-london-2017/schedule#day_2
  • https://medium.com/@Synced/customer-lifetime-value-prediction-using-embeddings-53f54e2ac59d
  • https://github.com/cseward/ngramlanguagemodel
  • https://arxiv.org/abs/1704.04110
  • https://arxiv.org/pdf/1702.02098.pdf
  • https://rare-technologies.com/implementing-poincare-embeddings/
  • http://blog.fastforwardlabs.com/2018/01/22/exploring-recommendation-systems.html?utmcampaign=Data%2BElixir&utmmedium=email&utmsource=DataElixir_166
  • https://booking.ai/how-booking-com-increases-the-power-of-online-experiments-with-cuped-995d186fff1d

22.01

  • https://thenextweb.com/artificial-intelligence/2018/01/10/you-think-it-and-a-robot-sees-it-the-future-is-here-with-mind-reading-ai/
  • hyperQA https://github.com/vanzytay/HyperQA

20.01

  • https://www.kaggle.com/steubk/fixing-typos
  • https://github.com/ChenglongChen/KaggleHomeDepot/blob/master/Code/Chenglong/googlespellingcheckerdict.py
  • https://github.com/ChenglongChen/Kaggle_HomeDepot/tree/master/Code/Chenglong
  • https://github.com/lystdo/Codes-for-WSDM-CUP-Music-Rec-1st-place-solution/blob/master/nn_structure.pdf

19.01

  • fashion relevant is not enough: https://arxiv.org/pdf/1406.3561.pdf
  • Yahoo portrait user: https://arxiv.org/pdf/1512.04912.pdf
  • predict buying intention: https://arxiv.org/pdf/1511.06247.pdf
  • realtime community detection: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188702

18.01

  • http://www.bayareabikeshare.com/assets/pdf/Bjorn.pdf
  • https://github.com/baumanab/BayAreaBikeShare
  • https://blog.modeanalytics.com/python-data-visualization-libraries/
  • http://thfield.github.io/babs/

17.01

  • https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/what-ai-can-and-cant-do-yet-for-your-business
  • https://github.com/mryab/webgames-ltv-prediction/blob/master/Webgames%20LTV%20Prediction-Android.ipynb

15.01

  • https://github.com/diefimov/MTH594_MachineLearning/tree/master/ipython
  • https://www.slideshare.net/RubensZimbres/portfolio-82-2017
  • https://github.com/deanwampler/JustEnoughScalaForSpark
  • https://www.datasciencecentral.com/profiles/blogs/business-intelligence-and-data-science-fuzzy-borders
  • https://ourworldindata.org/
  • https://github.com/primetang/pyflann
  • https://github.com/jfloff/pywFM
  • https://lab.getbase.com/pandarize-spark-dataframes/
  • https://towardsdatascience.com/understanding-feature-engineering-part-1-continuous-numeric-data-da4e47099a7b

12.01

  • https://blog.goodaudience.com/ai-in-2018-for-researchers-8955df0caaf9
  • https://github.com/pandas-profiling/pandas-profiling
  • http://support.minitab.com/en-us/minitab-express/1/help-and-how-to/modeling-statistics/regression/supporting-topics/basics/a-comparison-of-the-pearson-and-spearman-correlation-methods/
  • https://github.com/dipanjanS/practical-machine-learning-with-python

11.01

  • https://github.com/emilwallner/Screenshot-to-code-in-Keras

10.01

  • https://medium.com/wish-engineering/scaling-analytics-at-wish-619eacb97d16
  • http://www.awesomestats.in/
  • https://dataelixir.com/issues/164#start

08.01

  • http://www.predictiveanalyticsworld.com/patimes/uplift-modeling-making-predictive-models-actionable/8578/
  • http://www.cs.columbia.edu/~evs/papers/
  • http://www.nit.eu/czasopisma/JTIT/2012/2/43.pdf
  • https://github.com/PGuti/Uplift/blob/master/Uplift%20Evaluation.ipynb
  • http://www.predictiveanalyticsworld.com/book/press.php#articlesbytheauthor

04.01

  • https://spark-in.me/post/learn-data-science
  • https://habrahabr.ru/company/ods/blog/322626/
  • https://docs.google.com/spreadsheets/d/1dXghGL0hH6gs3H9Km7zhOpk9MWufRJ_bSrFw0NLaRuo/edit#gid=791694085
  • https://github.com/mmcs-sfedu/ds_workshop/blob/master/refs.md
  • http://www.inference.vc/
  • http://fastml.com/two-faces-of-overfitting-subscribers-only/
  • http://quantresearchgroup.ru/
  • https://livebook.datascienceheroes.com/index.html
  • https://github.com/esokolov/ml-course-msu
  • https://medium.freecodecamp.org/every-single-machine-learning-course-on-the-internet-ranked-by-your-reviews-3c4a7b8026c0
  • http://www.offconvex.org/
  • https://github.com/ogrisel/parallelmltutorial/blob/master/notebooks/08%20-%20Large%20Scale%20Text%20Classification%20for%20Sentiment%20Analysis.ipynb
  • https://github.com/shervinea

03.01

  • http://rahnamayan.ca/assets/documents/Customer%20Shopping%20Pattern%20Prediction-%20A%20Recurrent%20Neural%20Network%20Approach.pdf
  • http://eprints.bournemouth.ac.uk/10107/1/ConsumerBehaviourTheory-Approaches%26Models.pdf
  • https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks?utmcampaign=Data%2BElixir&utmmedium=email&utmsource=DataElixir_163
  • https://blog.keen.io/architecture-of-giants-data-stacks-at-facebook-netflix-airbnb-and-pinterest-9b7cd881af54
  • https://doogkong.github.io/2017/papers/paper2.pdf
  • https://pdfs.semanticscholar.org/a8cd/90fd6fce09f38a391579057d3207235a431b.pdf
  • http://www.marekrei.com/blog/ml-nlp-publications-in-2017/
  • http://www.aihelsinki.com/a-collection-of-tensorflow-resources-for-self-study/

02.01

  • https://github.com/blast-analytics-marketing/RFM-analysis
  • https://cran.r-project.org/web/packages/BTYD/vignettes/BTYD-walkthrough.pdf
  • http://www.blastam.com/blog/rfm-analysis-boosts-sales
  • http://cdn.intechopen.com/pdfs/13162.pdf

22.12

  • http://www.real-statistics.com/chi-square-and-f-distributions/one-sample-hypothesis-testing-variance/

21.12

  • https://github.com/Arturus/kaggle-web-traffic
  • https://oneau.wordpress.com/2011/02/28/simple-statistics-with-scipy/

20.12

  • https://datahack.analyticsvidhya.com/contest/all/
  • https://github.com/zixia/chinese-whispers
  • https://github.com/tudarmstadt-lt/sensegram/tree/master/chinese-whispers
  • https://github.com/zhly0/facenet-face-cluster-chinese-whispers-

18.12

  • nips review https://docs.google.com/spreadsheets/d/1ZQMXFAVapEOm1y53ijEJ1Ds6Mls-z6ZtoJKpJmHogzo/edit#gid=0
  • https://towardsdatascience.com/how-docker-can-help-you-become-a-more-effective-data-scientist-7fc048ef91d5
  • http://www.akbc.ws/2017/

17.12

  • http://proceedings.mlr.press/v7/guyon09/guyon09.pdf
  • http://proceedings.mlr.press/v7/miller09/miller09.pdf
  • https://ragulpr.github.io/assets/draftmasterthesismartinssonegilwtternn_2016.pdf
  • https://github.com/catboost/benchmarks/tree/master/quality_benchmarks

16.12

  • https://donjayamanne.github.io/pythonVSCodeDocs/docs/jupyter_getting-started/

15.12

  • http://lifelines.readthedocs.io/en/latest/Survival%20analysis%20with%20lifelines.html#estimating-the-survival-function-using-kaplan-meier
  • http://daynebatten.com/2015/02/customer-churn-survival-analysis/
  • survival model, princeton notes: http://data.princeton.edu/wws509/notes/
  • http://daynebatten.com/2017/02/recurrent-neural-networks-churn/
  • http://www.machinelearning.ru/wiki/index.php
  • http://www.machinelearning.ru/wiki/images/0/06/PZAD201603visualize.pdf
  • http://www.machinelearning.ru/wiki/images/c/cc/PZAD201609rf.pdf
  • http://www.machinelearning.ru/wiki/images/8/8e/PZAD201610tfeatures.pdf
  • http://www.machinelearning.ru/wiki/images/e/e7/PZAD201614social.pdf

14.12

  • https://github.com/daynebatten/keras-wtte-rnn
  • https://ragulpr.github.io/2016/12/22/WTTE-RNN-Hackless-churn-modeling/
  • https://github.com/erikbern
  • conversion rate, survival analysis: https://erikbern.com/2017/05/23/conversion-rates-you-are-most-likely-computing-them-wrong.html
  • https://data-literacy.geckoboard.com
  • https://erikbern.com/2017/12/12/learning-from-users-faster-using-machine-learning.html
  • https://github.com/UrbanInstitute/pyspark-tutorials
  • GP: http://bridg.land/posts/gaussian-processes-1
  • mining non redundant sequence https://arxiv.org/pdf/1712.04159.pdf

13.12

  • https://www.davidculley.com/installing-python-on-a-mac/
  • http://bomilanovich.com/blog/howto-install-pyqt-on-mac-with-python-3/
  • https://sascompetitions.ru/

12.12

  • https://rare-technologies.com/mummy-effect-bridging-gap-between-academia-industry/
  • http://ruder.io/deep-learning-optimization-2017/
  • don't decay the learning rate, increase the batch size: https://pdfs.semanticscholar.org/3299/aee7a354877e43339d06abb967af2be8b872.pdf
  • https://medium.com/@Synced/nips-2017-day-1-2-highlights-67ab464086c

11.12

  • http://learningsys.org/nips17/assets/slides/dean-nips17.pdf
  • https://www.datascience.com/resources/notebooks/overview-churn-modeling-techniques
  • https://swarbrickjones.wordpress.com/2017/03/28/cross-entropy-and-training-test-class-imbalance/

10.12

  • https://www.datascience.com/resources/notebooks/overview-churn-modeling-techniques

07.12

  • bayesian variable explanation: https://www.kdnuggets.com/2017/11/bayesian-networks-understanding-effects-variables.html
  • end2end ML/DL https://aws.amazon.com/sagemaker/ (colab?)
  • test of time https://www.youtube.com/watch?time_continue=2&v=Qi1Yry33TQE

06.12

  • http://proceedings.mlr.press/v7/niculescu09/niculescu09.pdf
  • https://www.nature.com/articles/d41586-017-07522-z
  • http://www.wiseathena.com/pdf/wa_dl.pdf

05.12

  • https://www.dataiku.com/learn/guide/tutorials/churn-prediction.html
  • https://www.dataiku.com/solutions/use-cases/lifetime-value-optimisation/

04.12

  • scikit optimize https://www.youtube.com/watch?v=DGJTEBt0d-s
  • https://github.com/fmfn/BayesianOptimization/blob/master/examples/xgboost_example.py
  • https://www.dataapplab.com/wp-content/uploads/2017/05/DAL-Kaggle-cometition.pdf

02.12

online marketing applications:

  • https://pydata.org/carolinas2016/schedule/presentation/23/
  • https://github.com/maoting1223/pyconsg2016
  • https://www.youtube.com/watch?v=gx6oHqpRgpY

01.12

  • https://github.com/latuannetnam/kaggle

30.11

  • https://hbr.org/2017/06/a-refresher-on-ab-testing
  • Reuters Tracer: https://arxiv.org/pdf/1711.04068.pdf
  • https://github.com/DmitryUlyanov/deep-image-prior
  • https://research.googleblog.com/2017/11/interpreting-deep-neural-networks-with.html

29.11

  • https://www.nytimes.com/2017/11/28/technology/artificial-intelligence-research-toronto.html
  • https://rare-technologies.com/machine-learning-hardware-benchmarks/
  • how xgboost handles NaNs: https://github.com/dmlc/xgboost/issues/21
  • https://github.com/ledmaster?tab=repositories
  • automata extraction from RNN https://arxiv.org/pdf/1711.09576.pdf
  • shap vis: https://github.com/slundberg/shap
  • http://www.cs.jhu.edu/~ayuille/courses/Stat161-261-Spring14/Big%20data%20are%20we%20making%20a%20big%20mistake%20-%20FT.pdf
  • http://cdn2.hubspot.net/hub/215445/file-1390429685-pdf/DIebook-HowtoBuildandLeadaWinningData_Team-1.pdf?t=1435065619454

28.11

  • https://www.aarki.com/blog/using-machine-learning-to-predict-campaign-performance
  • https://www.forbes.com/sites/forbesagencycouncil/2017/11/15/how-machine-learning-can-maximize-the-success-of-marketing-campaigns/3/#4bbeb8df7846

27.11

  • https://www.slideshare.net/DataRobot/featurizing-log-data-before-xgboost
  • https://www.slideshare.net/DataRobot/make-sense-out-of-data-with-feature-engineering
  • https://www.slideshare.net/KaiX/xavier-conort-datascience-sg-meetup-challenges-in-insurance-pricing
  • https://www.slideshare.net/KaiX/forecasting-techniques-data-science-sg
  • https://github.com/thiakx?tab=repositories
  • https://www.kdnuggets.com/2017/11/ng-deep-learning-specialization-21-lessons.html?utmcontent=buffera7008&utmmedium=social&utmsource=twitter.com&utmcampaign=buffer
  • https://github.com/gigamailer/simplenin/blob/master/Mastering%20Feature%20Engineering%20%2528Early%20Release%2529-O%2527Reilly%25282016%2529.pdf
  • https://github.com/svegapons/kaggleairbnb/blob/master/codekeras.py

24.11

  • https://www.bloomberg.com/company/announcements/bloomberg-magic-machine-learning/
  • https://www.investopedia.com/terms/a/alpha.asp
  • http://web.nchu.edu.tw/~jodytsao/MarkegingG/IIR10-Sentiment%20Analysis.pdf
  • https://flyyufelix.github.io/2017/11/17/direct-future-prediction.html
  • https://medium.com/@jeffykao/more-than-a-million-pro-repeal-net-neutrality-comments-were-likely-faked-e9f0e3ed36a6

23.11

  • http://blog.paralleldots.com/data-science/breakthrough-research-papers-and-models-for-sentiment-analysis/?lipi=urn%3Ali%3Apage%3Adflagship3pulse_read%3BiC%2Fq1jhKSuCkAgj9YxVOuQ%3D%3D
  • https://github.com/Far0n/xgbfi

22.11

  • https://github.com/CleverTap/Analyticsdsarticles/tree/master/Data-Informed/Feature_Engineering
  • https://towardsdatascience.com/diary-of-a-data-scientist-at-booking-com-924734c71417
  • A/B testing at Booking https://arxiv.org/pdf/1710.08217.pdf
  • https://booking.ai/named-entity-classification-d14d857cb0d5
  • https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.mstats.winsorize.html
  • http://data-informed.com/how-to-improve-machine-learning-tricks-and-tips-for-feature-engineering/

21.11

  • https://www.analyticsvidhya.com/blog/2017/06/which-algorithm-takes-the-crown-light-gbm-vs-xgboost/
  • tune lightgbm: https://github.com/Microsoft/LightGBM/blob/master/docs/Parameters-Tuning.rst
  • https://github.com/mdda/compressing-word-embeddings/tree/master/notebooks
  • https://dashee87.github.io/deep%20learning/python/predicting-cryptocurrency-prices-with-deep-learning/
  • https://github.com/dashee87/blogScripts/blob/master/Jupyter/2017-11-20-predicting-cryptocurrency-prices-with-deep-learning.ipynb

17.11

  • https://medium.com/searchink-eng/keras-horovod-distributed-deep-learning-on-steroids-94666e16673d
  • https://us3.campaign-archive.com/?u=6a29d4cc0471455d38260b3cc&id=ddf2eee959
  • http://wangzhinan.com/2017/02/20/wsdm17-summary/#more

16.11

  • deep ensembling: https://cambridgespark.com/content/tutorials/neural-networks-tuning-techniques/index.html
  • https://www.technologyreview.com/s/609495/ai-can-be-made-legally-accountable-for-its-decisions/?utmsource=twitter.com&utmmedium=social&utmcontent=2017-11-15&utmcampaign=Technology+Review
  • https://beamandrew.github.io/deeplearning/2017/06/04/deeplearningworks.html
  • https://github.com/taolei87/rcnn
  • https://research.googleblog.com/2017/11/sling-natural-language-frame-semantic.html
  • model interpretation: https://blog.kjamistan.com/towards-interpretable-reliable-models/
  • https://github.com/cgnorthcutt/rankpruning
  • https://github.com/PAIR-code/facets/blob/master/facetsoverview/Overviewdemo.ipynb
  • https://github.com/google/sling

15.11

  • https://github.com/catboost/catboost/blob/master/catboost/tutorials/quoracatboostw2v.ipynb
  • https://spacy.io/usage/v2

14.11

  • https://github.com/PipelineAI/pipeline

13.11

  • https://github.com/kaz-Anova/StackNet
  • https://tech.yandex.com/catboost/doc/dg/concepts/python-referencecatboostclassifierfit-docpage/
  • https://machinelearning.apple.com/2017/10/01/hey-siri.html
  • MLConf SF 2017: https://www.slideshare.net/JuneAndrews/counter-intuitive-machine-learning-for-the-industrial-internet-of-things-81862870/1
  • https://www.slideshare.net/SessionsEvents
  • https://towardsdatascience.com/7-takeaways-from-mlconf-sf-1b2703db5ecb

10.11

  • what's wrong with CNNs: https://www.youtube.com/watch?v=rTawFwUvnLE
  • https://medium.com/@culurciello/deep-neural-network-capsules-137be2877d44

09.11

  • vizuka: https://github.com/0011001011/Vizuka
  • https://www.youtube.com/watch?feature=youtu.be&v=klYBPl1ljTQ&list=PLGVZCDnMOq0rjkF7p_F4qtaVJQnjK1oKT&app=desktop

08.11

  • pearson correlation: https://en.wikipedia.org/wiki/Pearsoncorrelationcoefficient
  • jensen inequality: https://en.wikipedia.org/wiki/Jensen%27s_inequality
  • ui2code: https://uizard.io/
  • https://pypi.python.org/pypi/textstat/
  • mse vs pearson correlation: http://www.bwgriffin.com/gsu/courses/edur8132/notes/Notes8c2_RegressionModelFit.pdf

03.11

  • https://hackernoon.com/latest-deep-learning-ocr-with-keras-and-supervisely-in-15-minutes-34aecd630ed8

02.11

  • https://github.com/XifengGuo/CapsNet-Keras/blob/master/CapsNet.py
  • outlier detection: http://bugra.github.io/work/notes/2014-03-31/outlier-detection-in-time-series-signals-fft-median-filtering/
  • actionable classification: https://arxiv.org/abs/1607.02501
  • https://www.youtube.com/watch?v=NOUMgThZ5UE
  • http://www.swisstext.org/#daeniken
  • http://people.inf.ethz.ch/ganeao/emnlp17deeped.pdf
  • http://www.swisstext.org/docs/2017/Presentation/daeniken/swisstextpiusvon_daeniken.pdf
  • http://www.swisstext.org/docs/2017/Presentation/pappas/swisstext17.pdf

01.11

  • two sample test, mean: https://www.isixsigma.com/tools-templates/hypothesis-testing/making-sense-two-sample-t-test/
  • two sample test, ratio: https://github.com/maoting1223/pyconsg2016
  • Welch's t-test vs Student's t-test: http://daniellakens.blogspot.com/2015/01/always-use-welchs-t-test-instead-of.html
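
A minimal sketch of the Welch statistic the last link argues for, using only the standard library. The function name `welch_t` and the sample data are hypothetical, and this is not code from any of the linked pages; it only shows how the unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom are computed.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    n1, n2 = len(a), len(b)
    # Squared standard error of each sample mean (sample variance / n).
    v1 = variance(a) / n1
    v2 = variance(b) / n2
    t = (mean(a) - mean(b)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(f"t = {t:.4f}, df = {df:.4f}")
```

Unlike Student's test, the two variances are not pooled, so the degrees of freedom are generally fractional and smaller than n1 + n2 - 2.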

31.10

  • structured data: https://github.com/random-forests/tensorflow-workshop/blob/master/examples/07structureddata.ipynb
  • https://www.pyimagesearch.com/2017/10/30/how-to-multi-gpu-training-with-keras-python-and-deep-learning/
  • kaggle survey: LR first, tree second: https://www.kaggle.com/surveys/2017
  • fe best practice: https://www.quora.com/What-are-some-best-practices-in-Feature-Engineering
  • ppmi vs svd: https://github.com/piskvorky/wordembeddings/blob/master/runembed.py
  • class imbalance in cnn: https://arxiv.org/pdf/1710.05381.pdf
  • rnnvis: https://arxiv.org/pdf/1710.10777.pdf
  • task detection from email: https://medium.com/@rodrigo_23805/extracting-tasks-from-emails-first-challenges-86e7fbbf4672
  • interactive cm: https://rare-technologies.com/interactive-confusion-matrix-python/

30.10

  • radim newsletter http://us3.campaign-archive.com/?u=6a29d4cc0471455d38260b3cc&id=9f47229ab0
  • prodLDA in keras: https://github.com/nzw0301/keras-examples/blob/master/prodLDA.ipynb
  • prodLDA: https://openreview.net/pdf?id=BybtVK9lg
  • bounter: https://github.com/RaRe-Technologies/bounter
  • http://cikm2017.org/mainconschedule.html
  • http://gael-varoquaux.info/stats_in_python_tutorial/
  • http://matthewrocklin.com/blog/work/2017/10/16/streaming-dataframes-1
  • GA: http://blog.otoro.net/2017/10/29/visual-evolution-strategies/
  • nlp talk: https://www.cs.umb.edu/~twang/file/cs188_TongWang.pdf
  • http://yutori-datascience.hatenablog.com/entry/2017/10/29/205433

29.10

  • linguistic structure is back, acl 2017: http://www.abigailsee.com/2017/08/30/four-deep-learning-trends-from-acl-2017-part-1.html

28.10

  • https://www.bloomberg.com/graphics/2017-wall-street-robots/

27.10

  • https://www.kaggle.com/knowledgegrappler/magic-embeddings-keras-a-toy-example

26.10

  • Coursera kaggle: https://www.coursera.org/learn/competitive-data-science

25.10

  • how to start ML/DL/NLP https://drive.google.com/file/d/0B2cCJQ2_aOwjUmFnRko2QjRGelE/view
  • https://www.slideshare.net/lopusz/debugging-machinelearning
  • https://github.com/meereeum/lda2vec-tf
  • https://medium.com/@rchang/advice-for-new-and-junior-data-scientists-2ab02396cf5b
  • https://github.com/YuriyGuts/kaggle-quora-question-pairs/blob/master/notebooks/classify-lightgbm-cv-pred.ipynb

24.10

  • https://github.com/plaidml/plaidml
  • https://www.youtube.com/watch?v=G4uDBe28ryQ
  • https://github.com/ilkarman/DeepLearningFrameworks

23.10

  • https://github.com/leondz/entity_recognition

20.10

  • https://docs.google.com/presentation/d/1vFlR9QJ4v1XnRg0-sNhe01gZUjj1utDdAUHScjzOtI/edit#slide=id.g271203ffb62_8
  • http://matrixmultiplication.xyz
  • http://blog.yhat.com/posts/logistic-regression-python-rodeo.html

19.10

  • https://jeremykun.com/2016/04/18/singular-value-decomposition-part-1-perspectives-on-linear-algebra/
  • http://multithreaded.stitchfix.com/blog/2017/10/18/stop-using-word2vec/
  • https://github.com/uber/horovod

18.10

  • swish(x) = x * sigmoid(x): https://arxiv.org/pdf/1710.05941.pdf
  • DrQA: document retriever, document reader: https://github.com/facebookresearch/DrQA
  • https://gist.github.com/GaelVaroquaux/ead9898bd3c973c40429
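
A small sketch of the swish activation noted in the first entry above, written with the standard library only. The helper names are hypothetical, and the optional `beta` scaling is included as an assumption about the generalized form x * sigmoid(beta * x); with the default `beta=1.0` it reduces to the formula in the note.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation: x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


if __name__ == "__main__":
    for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(x, swish(x))
```

For large positive inputs swish behaves like the identity, and for large negative inputs it approaches zero from below, which is the main difference from ReLU.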

17.10

  • outlier detection: https://storage.googleapis.com/supplementalmedia/udacityu/3104648634/Hodge+AustinOutlierDetection_AIRE381.pdf
  • https://lilianweng.github.io/lil-log/2017/09/28/anatomize-deep-learning-with-information-theory.html
  • opening the black box of DNN: https://arxiv.org/pdf/1703.00810.pdf
  • information plane for DL: https://www.youtube.com/watch?v=bLqJHjXihK8
  • information theory with C.Olah: http://colah.github.io/posts/2015-09-Visual-Information/

16.10

  • https://developers.soundcloud.com/blog/soundclouds-data-science-process

15.10

  • nlp curator: https://github.com/Kyubyong/nlp_tasks
  • https://github.com/Kulbear/deep-learning-coursera

13.10

  • Information theory of DL https://www.youtube.com/watch?v=RKvS958AqGY
  • https://arxiv.org/pdf/1709.03856.pdf

12.10

  • https://github.com/facebookresearch/StarSpace
  • https://www.youtube.com/watch?v=aircAruvnKk&feature=youtu.be
  • https://research.googleblog.com/2017/10/tensorflow-lattice-flexibility.html

11.10

  • book stats learning of Hastie: https://web.stanford.edu/~hastie/CASI_files/PDF/casi.pdf
  • http://www.recognition.mccme.ru/pub/RecognitionLab.html/slbook.pdf
  • https://www.ted.com/talks/jeremy_howard_the_wonderful_and_terrifying_implications_of_computers_that_can_learn
  • tsne map: https://artsexperiments.withgoogle.com/tsnemap/#2072.02,145.27,5710.37,2039.00,138.00,5689.00
  • http://people.cs.umass.edu/~brenocon/inlp2016/lectures/05,06-classif-scan.pdf
  • capsules https://research.google.com/pubs/pub46351.html
  • http://www.cs.toronto.edu/~fritz/absps/transauto6.pdf
  • http://cseweb.ucsd.edu/~gary/cs200/s12/Hinton.pdf

10.10

  • https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
  • https://github.com/esokolov/ml-course-hse/blob/master/2016-fall/lecture-notes/lecture11-dl.pdf

07.10

  • https://arxiv.org/pdf/1710.00027.pdf

05.10

  • emoji2 https://medium.com/huggingface/understanding-emotions-from-keras-to-pytorch-3ccb61d5a983
  • attention layer: https://gist.github.com/thomwolf/e309e779a08c1ba899514d44355cd6df#file-attentionlayerkeras-py

04.10

  • hard sigmoid: https://stackoverflow.com/questions/35411194/how-is-hard-sigmoid-defined
  • https://data.world/rickyhennessy/startup-names-and-descriptions/workspace/file?filename=startups.csv
  • position attention bi-lstm: https://arxiv.org/pdf/1703.10089.pdf
  • https://obilaniu6266h16.wordpress.com/2016/02/04/einstein-summation-in-numpy/
  • https://arxiv.org/pdf/1512.04916.pdf
  • oov: https://github.com/cheng6076/SNLI-attention/blob/master/oov_vec.py
  • https://fasttext.cc/blog/2017/10/02/blog-post.html
  • https://fasttext.cc/docs/en/language-identification.html
  • https://teachablemachine.withgoogle.com/
  • https://www.edvancer.in/machine-learning-vs-statistics/
  • https://www.slideshare.net/nikhildandekar/maintaining-high-quality-user-generated-content-through-machine-learning

03.10

  • https://statweb.stanford.edu/~candes/talks/Wald1.pdf
  • http://aimotion.blogspot.com/2011/11/machine-learning-with-python-logistic.html
  • https://arxiv.org/pdf/1705.08039.pdf
  • https://medium.com/@shanif/our-data-science-workflow-b974f30a124d
  • http://u.cs.biu.ac.il/~yogo/DepLing2017invited.pdf

02.10

  • https://github.com/DataScienceUB/DeepLearningfromScratch
  • https://medium.com/applied-data-science/new-r-package-the-xgboost-explainer-51dd7d1aa211
  • https://medium.com/@shanif/our-data-science-workflow-b974f30a124d
  • https://www.slideshare.net/GaelVaroquaux/computational-practices-for-reproducible-science

30.09

  • https://developers.google.com/machine-learning/glossary/
  • https://www.slideshare.net/GaelVaroquaux/computational-practices-for-reproducible-science
  • https://github.com/SSDS-Croatia/SSDS-2017
  • https://sites.google.com/site/ssdatascience2017/lecture-notes

29.09

  • feature selection multiple hypothesis testing: http://kelvinguu.com/posts/feature-selection-and-multiple-hypothesis-testing/
  • how to do feature selection correctly: http://kelvinguu.com/posts/why-naive-cross-validation-fails-at-feature-selection/
  • https://habrahabr.ru/post/326122/
  • http://soloro.ru
  • http://kelvinguu.com/
  • http://jakob.uszkoreit.net/
  • coarse to fine QA for long document: https://arxiv.org/pdf/1611.01839.pdf
  • generating sentences by editing prototypes: https://arxiv.org/pdf/1709.08878.pdf

28.09

  • http://ruder.io/optimizing-gradient-descent/
  • https://github.com/kuza55/keras-extras/blob/master/layers/DiffForest.py
  • https://arxiv.org/pdf/1702.08835.pdf
  • https://docs.google.com/presentation/d/1Ze7BAiWbMPyF0ax36D-aK00VfaGMGvvgD_XuANQW1gU/edit#slide=id.p
  • https://uima.apache.org/

27.09

  • https://arxiv.org/pdf/1608.01238.pdf
  • https://web.stanford.edu/~jurafsky/slp3/16.pdf
  • http://www.aclweb.org/anthology/N12-2009
  • https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf

25.09

  • brown cluster: https://arxiv.org/pdf/1608.01238.pdf
  • word sense: http://www.cs.columbia.edu/~mcollins/courses/6998-2011/lectures/yarowsky.pdf
  • http://www.derczynski.com/sheffield/papers/brown_impact.pdf
  • http://people.cs.georgetown.edu/cosc572/f16/21bdistslides.pdf
  • https://paulx-cn.github.io/blog/5th_Blog/

22.09

  • http://blog.kaggle.com/2017/09/21/instacart-market-basket-analysis-winners-interview-2nd-place-kazuki-onodera/

21.09

  • rossmann nnet https://arxiv.org/pdf/1604.06737.pdf
  • http://blog.kaggle.com/2016/01/22/rossmann-store-sales-winners-interview-3rd-place-cheng-gui/
  • https://kaggle2.blob.core.windows.net/forum-message-attachments/102102/3454/Rossmannnr1doc.pdf

19.09

  • memory augmented nnet for nlp: https://drive.google.com/file/d/0B9dqzboiV5u-UmxJQlJqcUl6anM/view
  • kaggle quora blog: https://indatalabs.com/blog/data-science/how-to-win-kaggle-competition

18.09

  • http://u.cs.biu.ac.il/~yogo/DepLing2017invited.pdf
  • http://newsletter.ruder.io/issues/nlp-news-review-of-emnlp-2017-analyzing-bias-google-brain-ama-dragnn-and-allennlp-72584

17.09

  • http://xrds.acm.org/blog/2017/07/power-wordnet-use-python/
  • https://simons.berkeley.edu/sites/default/files/docs/5950/2017.02.01-21.15.12-simons-nlp-tutorial.pdf
  • talking to machine: http://cs.stanford.edu/~pliang/papers/talking-xrds2014.pdf
  • zero learning talk: https://www.youtube.com/watch?v=6O5sttckalE

16.09

  • https://github.com/philipperemy/tensorflow-class-activation-mapping

15.09

  • http://www.cs.tut.fi/kurssit/SGN-2556/slides/Lecture6.pdf

14.09

  • https://cloud.google.com/blog/big-data/2017/01/learn-tensorflow-and-deep-learning-without-a-phd

13.09

  • strong algos: GBT, RF, SVM for classification: https://arxiv.org/pdf/1708.05070.pdf
  • https://medium.com/slalom-engineering/detecting-malicious-requests-with-keras-tensorflow-5d5db06b4f28
  • https://github.com/tensorflow/workshops
  • https://github.com/chuckyee/cardiac-segmentation
  • real time CNN: https://github.com/lampts/faceclassification/blob/master/technicalreport.pdf

12.09

  • https://en.wikipedia.org/wiki/White_Noise_(novel)
  • hitchhiker's guide to the galaxy:
  • https://www.cs.bgu.ac.il/~yoavg/uni/bloglike/baboons.html
  • http://u.cs.biu.ac.il/~yogo/courses/sem2017/

11.09

  • word embedding Komninos https://www.cs.york.ac.uk/nlp/extvec/
  • https://ku.cloud.panopto.eu/Panopto/Pages/Viewer.aspx?id=0954a17c-2702-4d8e-9412-12ae958a2790
  • score distribution is better: https://arxiv.org/abs/1707.09861
  • make a stable architecture (pretrained embedding and the last layer of the lstm are crucial): https://arxiv.org/abs/1707.06799
  • https://github.com/lanwuwei/paraphrase-dataset
  • why non convex: https://github.com/lanwuwei/paraphrase-dataset
  • https://www.reddit.com/r/dataisbeautiful/comments/6ykfvl/averagewordlengthfornytimescrosswordanswers/

10.09

  • dilated convnet https://medium.com/@TalPerry/convolutional-methods-for-text-d5260fd5675f
  • quora view: https://www.quora.com/challenges#views

09.09

  • https://ydkahin.github.io/blog/views-prediction---a-quora-challenge---part-iii-eda-feature-engineering-and-more/
  • https://github.com/Unbabel/
  • https://andre-martins.github.io/docs/emnlp2017_final.pdf
  • http://allennlp.org/tutorials/configuration

08.09

  • https://www.eff.org/ai/metrics
  • http://courses.wcupa.edu/rbove/Berenson/10th%20ed%20CD-ROM%20topics/section12_5.pdf
  • percy liang: http://shrdlurn.sidaw.xyz/acl16/
  • https://www.youtube.com/watch?v=mhHfnhh-pB4
  • https://manning-content.s3.amazonaws.com/download/d/bcdc8c6-3f2e-4a2d-974b-487fc1da7cdf/CholletDLwPythonMEAPV05ch1.pdf
  • http://ofir.io/Neural-Language-Modeling-From-Scratch/
  • https://www.thoughtco.com/normal-approximation-to-the-binomial-distribution-3126589

07.09

  • https://www.thoughtco.com/normal-approximation-to-the-binomial-distribution-3126589
  • http://www.stat.purdue.edu/~xuanyaoh/stat350/xyJan23Lec4.pdf
  • https://github.com/tensorflow/serving/blob/master/tensorflow_serving/example/mnist_client.py
  • https://medium.com/towards-data-science/how-to-deploy-machine-learning-models-with-tensorflow-part-2-containerize-it-db0ad7ca35a7
  • https://medium.com/towards-data-science/how-to-deploy-machine-learning-models-with-tensorflow-part-3-into-the-cloud-7115ff774bb6
  • https://github.com/Vetal1977/tfservingexample
  • https://github.com/udacity/deep-learning/blob/master/semi-supervised/semi-supervisedlearning2_solution.ipynb

06.09

  • https://medium.com/towards-data-science/how-to-deploy-machine-learning-models-with-tensorflow-part-2-containerize-it-db0ad7ca35a7
  • https://medium.com/zendesk-engineering/how-zendesk-serves-tensorflow-models-in-production-751ee22f0f4b
  • https://github.com/lampts/deep-learning-with-python-notebooks/blob/master/3.5-classifying-movie-reviews.ipynb

05.09

  • ds interview: http://www.thedsinterview.com/
  • 4 trends: structure is back, re embedding, blackbox transparency, attention: http://www.abigailsee.com/2017/08/30/four-deep-learning-trends-from-acl-2017-part-2.html
  • https://github.com/UKPLab/emnlp2017-relation-extraction
  • interpret rnn: https://github.com/philipperemy/tensorflow-isan-rnn

04.09

  • http://theorangeduck.com/page/neural-network-not-working
  • https://dzone.com/articles/natural-language-processing-adit-deshpande-cs-unde
  • https://github.com/ddtm/dl-course

03.09

  • http://multithreaded.stitchfix.com/blog/2017/08/31/warehouse-layouts/
  • https://machinelearningmastery.com/diagnose-overfitting-underfitting-lstm-models/

02.09

  • https://github.com/AlexandreRobicquet?tab=repositories
  • https://pillbox.nlm.nih.gov/developer.html#images

01.09

  • http://artemis-ml.readthedocs.io/en/latest/plotting.html
  • https://github.com/krystianity/keras-serving
  • https://github.com/Lausbert/Exermote/tree/master/ExermotePreprocessingAndTraining

31.08

  • http://liufuyang.github.io/2017/04/02/just-another-tensorflow-beginner-guide-4.html
  • https://github.com/Lausbert/Exermote/blob/master/ExermotePreprocessingAndTraining/trainer/exermote.py
  • http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.LeaveOneOut.html

30.08

  • effective tf: https://github.com/vahidk/EffectiveTensorflow
  • knn and bilstm https://arxiv.org/pdf/1708.07863.pdf
  • https://nlp.stanford.edu/pubs/jia2017adversarial.pdf
  • https://github.com/dformoso/machine-learning-mindmap

29.08

  • https://medium.com/@burgalon/deploying-your-keras-model-using-keras-js-2e5a29589ad8

28.08

  • https://nlp.stanford.edu/courses/cs224n/2015/reports/29.pdf
  • https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463?_lrsc=ce853194-65af-4e5e-a424-7d21025fd0c9
  • https://blog.fineighbor.com/tensorflow-dealing-with-imbalanced-data-eb0108b10701
  • https://arxiv.org/pdf/1707.05127.pdf

26.08

  • https://github.com/zalandoresearch/fashion-mnist/blob/master/README.md
  • https://github.com/DrMichaelWang/KaggleCancerProject/blob/master/Kaggle%20cancer%20-%20text%20key%20word%20frequency%20count_xgboost.ipynb

25.08

  • http://krisztianbalog.com/
  • https://medium.com/@erogol/designing-a-deep-learning-project-9b3698aef127
  • https://github.com/idiap/importance-sampling
  • http://krisztianbalog.com/files/talks/russir2016-el.pdf
  • https://github.com/kbalog/russir2016-el

24.08

  • http://blog.rtwilson.com/how-to-rescue-lost-code-from-a-jupyteripython-notebook/
  • http://maxberggren.se/2017/06/18/deep-learning-vs-xgboost/
  • http://beamandrew.github.io/deeplearning/2017/06/04/deeplearningworks.html

22.08

  • https://gist.github.com/menshikh-iv/0c691219314da35f48f10826b6d34d97
  • https://github.com/minimaxir/reactionrnn
  • http://www.kdnuggets.com/2017/08/oreilly-nyc-ai-conference-highlights.html
  • https://speakerdeck.com/tmylk/pycon-russia-2017-tiematichieskoie-modielirovaniie-dlia-liudiei
  • http://newsletter.ruder.io/issues/nlp-news-data-selection-ml-nlp-in-esports-vqa-bias-lyric-annotations-68803
  • https://github.com/fchollet/keras/releases/tag/2.0.7

21.08

  • https://github.com/rasbt/python-machine-learning-book-2nd-edition
  • https://github.com/sjvasquez/instacart-basket-prediction

18.08

  • http://evexdb.org/pmresources/vec-space-models/

17.08

  • https://medium.com/the-mission/a-genius-explains-how-to-be-creative-claude-shannons-long-lost-1952-speech-fbbcb2ebe07f

16.08

  • http://mltrainings.ru/
  • asap https://github.com/ddofer/asap/wiki/Getting-Started:-A-Basic-Tutorial
  • https://arxiv.org/pdf/1701.08318.pdf
  • genome modeling: https://cs224d.stanford.edu/reports/jessesz.pdf
  • https://www.reddit.com/r/MachineLearning/comments/6tu9gu/whatistheprocessofdeployingmachine_learning/?st=j6ee7uoq&sh=12c17107
  • https://github.com/chrisranderson/beholder
  • https://github.com/rasbt/deep-learning-book

15.08

  • http://bayes.wustl.edu/etj/prob/book.pdf

14.08

  • https://github.com/experiencor/deep-viz-keras
  • https://github.com/facebookresearch/SentEval
  • http://machinelearningmastery.com/reproducible-results-neural-networks-keras/
  • https://github.com/rasbt/deep-learning-book/blob/master/code/model_zoo/file-queues.ipynb
  • https://github.com/nlml/np-to-tf-embeddings-visualiser/blob/master/save_embeddings.py

13.08

  • https://github.com/vahidk/EffectiveTensorflow

11.08

  • http://chri.stophr.be/
  • https://github.com/nadbordrozd/text-top-model/tree/master/ttm/keras_models
  • https://tryolabs.com/blog/2017/08/10/finding-the-right-representation-for-your-nlp-data/
  • https://www.mira.law/blogposts/2017/5/12/semantic-averaging-of-documents-using-word2vec-representations

10.08

  • https://github.com/cpury/lstm-math

09.08

  • dl course: https://www.coursera.org/specializations/deep-learning

08.08

  • roc auc: http://www.navan.name/roc/
  • https://worksheets.codalab.org/worksheets/0x50757a37779b485f89012e4ba03b6f4f/
  • https://nlp.stanford.edu/pubs/jia2016recombination.pdf

07.08

  • best paper ICML: https://github.com/mlresearch/v70
  • https://explosion.ai/blog/prodigy-annotation-tool-active-learning
  • https://github.com/brannondorsey/kerasweightanimator
  • https://github.com/keveman/tensorflow-tutorial/blob/master/PTB%20Word%20Language%20Modeling.ipynb

06.08

  • https://github.com/brannondorsey/kerasweightanimator
  • https://github.com/pavitrakumar78/Anime-Face-GAN-Keras
  • https://code.facebook.com/posts/289921871474277/transitioning-entirely-to-neural-machine-translation/
  • https://prodi.gy/demo
  • https://prodi.gy/docs/

04.08

  • emoji transfer learning: https://arxiv.org/pdf/1708.00524.pdf
  • http://deepmoji.mit.edu/
  • importance sampling https://arxiv.org/pdf/1706.00043.pdf
  • larochelle https://drive.google.com/file/d/0ByUKRdiCDK7-LXZkM3hVSzFGTkE/view
  • bengio https://drive.google.com/file/d/0ByUKRdiCDK7-UXB1R1ZpX082MEk/view

01.08

  • pca with jake http://nbviewer.jupyter.org/github/jakevdp/PythonDataScienceHandbook/blob/master/notebooks/05.09-Principal-Component-Analysis.ipynb
  • https://openreview.net/pdf?id=HyaF53XYx

31.07

  • http://casa.disi.unitn.it/~moschitt/Teaching-slides/slides-AINLP-2016/NER&POS-AINLP.pdf
  • noise in feature space: https://openreview.net/pdf?id=HyaF53XYx
  • data augmentation using thesaurus: https://arxiv.org/pdf/1509.01626.pdf
  • https://theneuralperspective.com/
  • http://casa.disi.unitn.it/~moschitt/since2013/2015SIGIRSeveryn_TwitterSentimentAnalysis.pdf
  • https://einstein.ai/research/state-of-the-art-deep-learning-model-for-question-answering
  • https://sigmoidal.io/boosting-your-solutions-with-nlp/
  • http://www.fast.ai/2017/07/28/deep-learning-part-two-launch/
  • https://huyenchip.com/2017/07/28/confession.html
  • https://blog.slavv.com/37-reasons-why-your-neural-network-is-not-working-4020854bd607

25.07

  • how to ensemble https://mlwave.com/kaggle-ensembling-guide/
  • https://www.slideshare.net/TedXiao/winning-kaggle-101-dmitry-larkos-experiences
  • http://togelius.blogspot.se/2017/07/some-advice-for-journalists-writing.html
  • https://sadanand-singh.github.io/posts/treebasedmodels/
  • regression with keras: https://www.datacamp.com/community/tutorials/deep-learning-python

24.07

  • data readiness: https://arxiv.org/pdf/1705.02245.pdf
  • trophy data scientist: https://peadarcoyle.wordpress.com/2017/07/23/avoiding-being-a-trophy-data-scientist/
  • best paper cvpr 17: https://arxiv.org/pdf/1608.06993.pdf, https://github.com/liuzhuang13/DenseNet
  • https://github.com/titu1994/DenseNet
  • https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf

23.07

  • https://github.com/greydanus/excitation_bp

22.07

  • https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30
  • https://github.com/bloomberg/scatteract
  • http://gree2.github.io/ocr/2017/03/08/tesseract-ocr-parser-within-tika

21.07

  • https://www.youtube.com/watch?v=5sQ8-Er8tXM
  • https://github.com/HouJP/kaggle-quora-question-pairs
  • http://www.andreykurenkov.com/writing/a-brief-history-of-neural-nets-and-deep-learning/

20.07

  • https://github.com/hollance/YOLO-CoreML-MPSNNGraph

19.07

  • https://pjreddie.com/darknet/yolo/
  • ridge lr: http://www.utstat.toronto.edu/~guerzhoy/303/lec/lec8/ridge.pdf
  • https://github.com/catboost/catboost/tree/master/catboost/tutorials

18.07

  • https://kiko01b.wordpress.com/2011/07/16/replace-a-word-containing-a-slash-with-sed/
  • https://stackoverflow.com/questions/11392478/how-to-replace-a-string-in-multiple-files-in-linux-command-line
  • https://blog.keras.io/the-limitations-of-deep-learning.html
  • https://github.com/pair-code/facets

17.07

  • https://medium.com/@anandr42/the-data-science-delusion-7759f4eaac8e
  • https://gist.github.com/menshikh-iv/0c691219314da35f48f10826b6d34d97
  • http://www.fast.ai/2016/12/08/org-structure/
  • https://github.com/sarchak/MachineLearningNotebooks
  • nn for ir: https://arxiv.org/pdf/1707.04242.pdf
  • https://github.com/LeiG/Applied-Predictive-Modeling-with-Python

15.07

  • http://www.vjsonline.org/scientist-portrait/1500039392
  • https://github.com/jeongyoonlee/data-science-process-management

14.07

  • large csv: http://pythondata.com/working-large-csv-files-python/
  • https://arimo.com/data-science/2016/bayesian-optimization-hyperparameter-tuning/
  • bo https://github.com/phvu/misc/blob/master/sfcrimes/crimesjob_nn.py
  • foolbox https://arxiv.org/abs/1707.04131
  • http://www.aifounded.com/aifounded/recent-evolution-of-the-qa-datasets-and-going-forward/
  • https://gist.github.com/thomasjungblut/b58d70d260abf0eff1a8c447f3d07389#file-xgbbayesopt_cv-py
  • http://www.bosatsu.net/talks/sletten-datascience.pdf
  • https://github.com/dipanjanS/text-analytics-with-python/blob/master/Chapter-6/document_similarity.py

13.07

  • http://static.squarespace.com/static/51156277e4b0b8b2ffe11c00/t/53ad86e5e4b0b52e4e71cfab/1403881189332/AppliedPredictiveModelinginR.pdf
  • https://github.com/minimaxir/predict-reddit-submission-success
  • https://www.google.com/finance/company_news?q=NASDAQ%3AFB&ei=ZA5nWaCMMImFsAG8p4ewCw

12.07

  • https://github.com/organisciak/Text-Mining-Course
  • http://news.efinancialcareers.com/uk-en/285249/machine-learning-and-big-data-j-p-morgan
  • https://twitter.com/search?q=%23machinelearningflashcards&src=tyah

10.07

  • https://github.com/Wrosinski/berlin-ml-article
  • https://github.com/saulpw/visidata/blob/stable/docs/tours.rst
  • social emnlp: https://twitterinadvertising.files.wordpress.com/2017/02/tweeted-about-742-times.pdf
  • good pointers on nn: https://drive.google.com/file/d/0ByUKRdiCDK7-UXB1R1ZpX082MEk/view
  • https://github.com/0xnurl/kerascharacterbased_ner
  • https://www.aclweb.org/mirror/emnlp2016/proceedings/2016-emnlp-handbook.pdf

06.07

  • https://nlp.stanford.edu/software/crf-faq.shtml
  • Redcatlab: http://www.redcatlabs.com/2015-11-24IES-2015NER-from-Experts/
  • embedding compression http://sei.pku.edu.cn/~moull12/paper/cikm16.pdf
  • https://github.com/facebookresearch/InferSent

Maxout:

  • https://github.com/philipperemy/tensorflow-maxout/blob/master/maxout.py
  • https://arxiv.org/pdf/1302.4389.pdf

05.07

  • working with text for social: https://de.dariah.eu/tatom/
  • clickbait: https://github.com/saurabhmathur96/clickbait-detector
  • http://blog.echen.me/2012/01/03/introduction-to-conditional-random-fields/
  • http://nbviewer.jupyter.org/github/tpeng/python-crfsuite/blob/master/examples/CoNLL%202002.ipynb
  • CRF: https://arjoonn.blogspot.com/2016/01/prerequisites-for-conditional-random.html
  • NYT 1M: https://drive.google.com/file/d/0B0CbnDgKi0PyM1FEQXJRTlZtSTg/view
  • https://github.com/davidsbatista/NER-English-Gigaword-LDC
  • https://github.com/andreasvlachos/ALTAMLfor_NLP

04.07

  • https://www.slideshare.net/RasmusRothe/3-learnings-from-applying-deep-learning-to-real-world-problems
  • pytorch vs tf: https://medium.com/@dubovikov.kirill/pytorch-vs-tensorflow-spotting-the-difference-25c75777377b
  • https://github.com/Franck-Dernoncourt/NeuroNER/blob/master/trained_models/performances.md
  • http://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
  • https://offbit.github.io/how-to-read/

03.07

  • https://mathematical-coffees.github.io/mc07-ml/
  • ran: http://www.kentonl.com/pub/llz.2017.pdf
  • https://www.microsoft.com/en-us/research/wp-content/uploads/2017/06/fntir-neuralir-mitra.pdf
  • https://github.com/bdhingra/ga-reader/blob/master/model/GAReader.py
  • https://github.com/allenai/deepqa/tree/master/deepqa/layers
  • Gate for QA: https://arxiv.org/pdf/1606.01549.pdf
  • TWINE https://www.aclweb.org/anthology/E/E17/E17-3007.pdf
  • 30 nlp interview questions: https://www.analyticsvidhya.com/blog/2017/07/30-questions-test-data-scientist-natural-language-processing-solution-skilltest-nlp/
  • mlss: http://nuit-blanche.blogspot.com/2017/06/slides-machine-learning-summer-school.html
  • network analysis: http://i.stanford.edu/~jure/pub/talks2/leskovec-networks-01-nodes.pdf
  • dl: http://mlss.tuebingen.mpg.de/2017/speakerslides/Ruslan1.pdf, http://mlss.tuebingen.mpg.de/2017/speakerslides/Ruslan2.pdf
  • https://offbit.github.io/how-to-read/

02.07

  • http://ianozsvald.com/2017/07/01/kaggles-mercedes-benz-greener-manufacturing/
  • https://github.com/atveit/GANforiPhoneWithCoreML/blob/master/GAN.ipynb
  • https://www.raywenderlich.com/164213/coreml-and-vision-machine-learning-in-ios-11-tutorial
  • http://www.cs.nyu.edu/shasha/papers/StatisticsIsEasyExcerpt.html
  • http://www.physics.csbsju.edu/stats/

30.06

  • http://yerevann.com/a-guide-to-deep-learning/
  • https://github.com/stitchfix/seetd
  • https://github.com/minimaxir/facebook-page-post-scraper
  • https://github.com/rykov8/ssd_keras
  • https://github.com/yhenon/keras-frcnn
  • https://github.com/niderhoff/nlp-datasets
  • http://yerevann.github.io/2016/09/21/presentation-sentence-representations-and-question-answering/

29.06

  • scorecard application: https://www.linkedin.com/pulse/credit-risk-scorecard-monitoring-tracking-shailendra
  • http://cds.nyu.edu/wp-content/uploads/2014/04/bertinidatascienceshowcaseMay122014.pdf
  • annotation tool: https://github.com/RicardoUsbeck/QRTool
  • ned dataset: https://datahub.io/dataset/reuters-128-nif-ner-corpus

28.06

  • wsd: https://web.stanford.edu/class/cs224n/reports/2762042.pdf
  • speech and lang processing: http://www.cs.colorado.edu/~martin/slp.html
  • nlp course: http://naviglinlp.blogspot.com/2017/
  • ted dunning: http://aclweb.org/anthology/J93-1003
  • http://tdunning.blogspot.com/2008/03/surprise-and-coincidence.html
  • ll calculation: http://ucrel.lancs.ac.uk/llwizard.html
  • http://www.prooffreader.com/2014/12/most-decade-specific-words-in-billboard.html
  • https://github.com/Prooffreader/data-science-blogs
  • http://www.prooffreader.com/2015/05/most-characteristic-words-in-pro-and.html
  • https://github.com/zafarali?tab=repositories

27.06

  • http://nikolenko.livejournal.com/275253.html
  • CRF survey: http://nlpx.net/archives/464
  • https://github.com/LopezGG/NNNERtensorFlow
  • https://medium.com/hockey-stick/tl-dr-bayesian-a-b-testing-with-python-c495d375db4d
  • https://alexanderdyakonov.files.wordpress.com/2017/06/bookboostingpdf.pdf
  • https://github.com/backstopmedia/tensorflowbook
  • cs20si with tf: http://web.stanford.edu/class/cs20si/syllabus.html
  • rnn in excel https://docs.google.com/spreadsheets/d/18bkheoJbmMUqdRFrviUy_TiooSjvvpDqiti7hm2EASY/edit#gid=0
  • why elu (not relu): http://www.picalike.com/blog/2015/11/28/relu-was-yesterday-tomorrow-comes-elu/
  • https://medium.com/@timanglade/how-hbos-silicon-valley-built-not-hotdog-with-mobile-tensorflow-keras-react-native-ef03260747f3
  • https://gist.github.com/J-DM

26.06

  • is it significant? http://www.ox.ac.uk/media/global/wwwoxacuk/localsites/uasconference/presentations/P8Isitstatisticallysignificant.pdf
  • PSI: http://ucanalytics.com/blogs/population-stability-index-psi-banking-case-study/
  • loan credit: http://ucanalytics.com/blogs/data-visualization-case-study-banking/
  • FE: https://courses.cit.cornell.edu/cs5304/Lectures/lec5_FeatureEngineering.pdf
  • https://github.com/maciejkula/recommender_datasets
  • EL: https://github.com/namkhanhtran/EntityLinkingRetrieval-ELR
  • https://github.com/raghakot/keras-vis
  • https://gh.mltrainings.ru/presentations/SemenovTinkoffChallenge2017.pdf
  • http://ucanalytics.com/blogs/information-value-and-weight-of-evidencebanking-case/

24.06

  • https://github.com/klarsen1/Information

23.06

  • http://multithreaded.stitchfix.com/blog/2015/08/13/weight-of-evidence/
  • WOE: https://github.com/patrick201/information_value
  • https://github.com/akashgit/autoencodingvifortopicmodels
  • https://github.com/carpedm20/variational-text-tensorflow
  • AVITM https://openreview.net/pdf?id=BybtVK9lg
  • https://www.hackerearth.com/practice/machine-learning/advanced-techniques/winning-tips-machine-learning-competitions-kazanova-current-kaggle-3/tutorial/
  • gensim 2.2.3 https://github.com/RaRe-Technologies/gensim/releases/tag/2.2.0
  • tkm quora solution: https://www.slideshare.net/tkm2261/quora-76995457
  • http://yutori-datascience.hatenablog.com/

22.06

  • https://github.com/lampts/kaggle-quora-solution-8th
  • https://github.com/Far0n/xgbfi
  • http://microposts2016.seas.upenn.edu/challenge.html
  • https://github.com/wikilinks/nel/blob/master/notebooks/train.ipynb
  • http://www.semantic-web-journal.net/system/files/swj1562.pdf
  • https://github.com/jeniyat/TweeTime

21.06

  • sentiment corpus: https://www.w3.org/community/sentiment/wiki/Datasets
  • paper2code: https://github.com/daviddao/awesome-very-deep-learning
  • https://handong1587.github.io/deep_learning/2015/10/09/rnn-and-lstm.html
  • task benchmark https://www.eff.org/ai/metrics
  • http://willwolf.io/2017/06/15/random-effects-neural-networks/
  • http://colah.github.io/posts/2015-08-Backprop/
  • http://sdsawtelle.github.io/blog/output/getting-started-with-tensorflow-in-jupyter.html
  • https://arxiv.org/pdf/1611.05418.pdf

19.06

  • attention is all you need: https://github.com/Kyubyong/transformer
  • http://damiano.github.io/learning-similarity-functions-ORM/
  • https://github.com/abhishekkrthakur/clickbaits_revisited
  • entity filtering and topic detection: thesis-DamianoSpina.pdf
  • https://alexanderdyakonov.files.wordpress.com/2017/06/bookboostingpdf.pdf
  • https://github.com/ejmeij/entity-linking-and-retrieval-tutorial

14.06

  • automating FE, OneBM: https://arxiv.org/pdf/1706.00327.pdf
  • imbalance sklearn: https://glemaitre.github.io/talks/2017_PyParis/#1
  • feature selection: http://www.kdnuggets.com/2017/06/practical-importance-feature-selection.html
  • https://groups.google.com/a/tensorflow.org/forum/#!msg/discuss/Dhy9MseSXQI/naoy_EElBAAJ
  • https://github.com/curiousily
  • EL and ER: https://www.dropbox.com/sh/h7fr4yfrih6tisr/Q9BU8Qshcq?lst=

13.06

  • https://github.com/ageron/handson-ml
  • http://ft-interactive.github.io/visual-vocabulary/
  • https://phvu.net/2016/05/13/count-featurizer/

12.06

  • https://www.slideshare.net/HJvanVeen/kaggle-presentation
  • https://medium.com/udacity/launching-astra-fab2b76b6420
  • https://medium.com/@curiousily/tensorflow-for-hackers-part-ii-building-simple-neural-network-2d6779d2f91b
  • http://alexanderdyakonov.narod.ru/lpot4emu.pdf
  • https://github.com/turboNinja2/Homesite/blob/master/SubmissionsKeras.py

09.06

  • https://medium.com/@yoav.goldberg/an-adversarial-review-of-adversarial-generation-of-natural-language-409ac3378bd7
  • https://www.slideshare.net/HJvanVeen/feature-engineering-72376750

07.06

  • https://medium.com/@curiousily/tensorflow-for-hackers-part-ii-building-simple-neural-network-2d6779d2f91b
  • rnn in excel: https://docs.google.com/spreadsheets/d/18bkheoJbmMUqdRFrviUy_TiooSjvvpDqiti7hm2EASY/edit#gid=316082502
  • http://nlp.cs.rpi.edu/paper/sigmod2016.pdf
  • http://distill.pub/2016/augmented-rnns/
  • http://xren7.web.engr.illinois.edu/KDD15-ClusType_v3.pdf
  • gp: https://github.com/phvu/misc/blob/master/bayesopt/gaussian_process.py

05.06

  • https://github.com/Franck-Dernoncourt/NeuroNER

02.06

  • https://github.com/kailashahirwar/cheatsheets-ai/blob/master/All%20Cheat%20Sheets.pdf
  • https://docs.microsoft.com/en-us/cognitive-toolkit/Using-CNTK-with-Keras

01.06

  • https://github.com/georgeiswang/QueryClassficationLSTM
  • https://www.oreilly.com/ideas/language-understanding-remains-one-of-ais-grand-challenges
  • AI and NLP: https://www.xenonstack.com/blog/overview-of-artificial-intelligence-and-role-of-natural-language-processing-in-big-data
  • http://ndres.me/kaggle-past-solutions/
  • https://github.com/UKPLab/semeval2017-scienceie
  • http://www.nada.kth.se/~ann/exjobb/jan_vandekerkhof.pdf
  • https://blog.booking.com/multivariant-tests-for-performance.html
  • https://www.ambiverse.com/make-your-news-smarter/
  • learn to search: https://hunch.net/~l2s/merged.pdf

31.05

  • https://dennisforbes.ca/#a302
  • http://www.namedevelopment.com/blog/default.html
  • http://www.telegraph.co.uk/finance/personalfinance/comment/4478124/The-name-game.html
  • http://new.opencalais.com/wp-content/uploads/2016/01/Thomson-Reuters-Intelligent-Tagging-On-Premise-API-User-Guide.pdf
  • http://cs231n.stanford.edu/slides/2017/cs231n2017lecture10.pdf

30.05

  • http://tkipf.github.io/graph-convolutional-networks/
  • http://deeploria.gforge.inria.fr/thomasTalk.pdf
  • graph cnn https://github.com/tkipf/gcn
  • deeploria: http://deeploria.gforge.inria.fr/
  • dedupe: https://github.com/dedupeio/dedupe
  • http://sebastianruder.com/multi-task/index.html
  • https://arxiv.org/pdf/1705.09585.pdf
  • https://clgiles.ist.psu.edu/pubs/jcdl2015-name-disambiguation.pdf

29.05

  • why PReLU, maxout: http://cs231n.stanford.edu/slides/2017/cs231n2017lecture6.pdf

26.05

  • https://medium.springboard.com/interesting-talks-from-pydata-london-2017-d17b06c1ed5e
  • https://github.com/DistrictDataLabs/yellowbrick
  • https://github.com/lucjb/pydata2017/blob/master/Multicolinearity.py
  • https://github.com/cavaunpeu/dotify/blob/master/notebooks/neuralimplicitmf.ipynb

25.05

  • https://www.zanaducloud.com/CC6612B2-B42A-4765-A0C8-4FDB3CEF50E2
  • http://willwolf.io/2017/05/18/minimizingthenegativeloglikelihoodinenglish/
  • https://github.com/cavaunpeu/dotify/blob/master/notebooks/neuralimplicitmf.ipynb

21.05

  • data interview: https://github.com/talolard/Interview
  • https://medium.com/@nikasa1889/a-guide-to-receptive-field-arithmetic-for-convolutional-neural-networks-e0f514068807
  • https://medium.com/@TalPerry/convolutional-methods-for-text-d5260fd5675f

20.05

  • https://github.com/nelson-liu/paraphrase-id-tensorflow

19.05

  • https://github.com/Microsoft/LightGBM/wiki/Installation-Guide
  • https://github.com/ArdalanM/pyLightGBM

18.05

  • https://github.com/Franck-Dernoncourt/NeuroNER
  • https://128.84.21.199/pdf/1705.06273.pdf

17.05

  • http://www.jaist.ac.jp/~bao/DS2017/BigData-I-Dinh-v4-4perPage.pdf

16.05

  • https://www.youtube.com/watch?v=HS7mObQttxU
  • https://en.wikipedia.org/wiki/BLEU
  • http://www.mathsisfun.com/data/quincunx.html

15.05

  • https://github.com/hengluchang/Quora-Paraphrase-Question-Identification
  • online w2v: https://markroxor.github.io/gensim/static/notebooks/onlinew2vtutorial.html
  • https://en.wikipedia.org/wiki/Smith%E2%80%93Waterman_algorithm
  • http://climberg.de/page/smith-waterman-distance-for-feature-extraction-in-nlp/

13.05

  • https://blog.dataiku.com/2015/08/24/xgboostanddss
  • https://gist.github.com/walterreade/6e20dba959277bd9af77
  • https://github.com/lucjb/pydata2017/blob/master/Multicolinearity.py
  • https://github.com/christophebourguignat/notebooks/blob/master/Calibration.ipynb

12.05

  • https://github.com/christophebourguignat/notebooks
  • https://www.kaggle.com/tqchen/understanding-xgboost-model-on-otto-data#script-save-run
  • http://education.parrotprediction.teachable.com/p/practical-xgboost-in-python
  • https://github.com/makeyourowntextminingtoolkit/makeyourowntextminingtoolkit
  • https://docs.google.com/presentation/d/1ukZMzz4rNN0MHegTNgjwLAI-kMWL1mGZPkp1bUCVckc/edit#slide=id.g21fc752465074

11.05

  • https://en.wikiquote.org/wiki/Xmeno_Xs#English
  • https://www.ff.umb.sk/app/cmsSiteAttachment.php?ID=2348

10.05

  • A/B test common pitfalls: https://www.youtube.com/watch?v=NkQ51iyFgs0

09.05

  • http://www.aclweb.org/anthology/N10-1021

08.05

  • https://github.com/benathi/word2gm
  • http://mirnazim.org/writings/python-ecosystem-introduction/
  • https://speakerdeck.com/marcobonzanini/static-type-analysis-for-robust-data-products-at-pydata-london-2017
  • https://github.com/konradczechowski/discopt/blob/master/discoptgeneralusage.ipynb
  • high order fm: https://arxiv.org/pdf/1607.07195.pdf
  • https://kaggle2.blob.core.windows.net/competitions/kddcup2012/2748/media/OperaSlides.pdf
  • https://pydata.org/london2017/schedule/
  • http://www.kemaswill.com/uncategorized/from-matrix-factorization-to-factorization-machines/
  • https://github.com/geffy/tffm

05.05

  • https://github.com/geffy/tffm
  • ds handbook: https://github.com/jakevdp/PythonDataScienceHandbook
  • https://github.com/bstriner/keras-tqdm
  • https://github.com/src-d/wmd-relax
  • https://github.com/krasch/presentations/blob/master/unittestingdata_science.pdf

04.05

  • https://www.kaggle.com/wangyijia/xgboost-tfidf-logloss-0-3/comments/code
  • https://www.kaggle.com/jturkewitz/magic-features-0-03-gain/

03.05

  • https://github.com/stared/keras-sequential-ascii
  • https://github.com/abhishekkrthakur/clickbaits_revisited

02.05

  • https://github.com/tjpalanca/facebook-news-analysis

30.04

  • https://snap.stanford.edu/data/web-Amazon.html

27.04

  • https://howchoo.com/g/ytkwotvkztq/using-the-iterm-2-and-tmux-integration

26.04

  • bm25 implementation: https://github.com/alexeygrigorev/avito-duplicates-kaggle/blob/master/bm25.py
  • bm25 vs tfidf: https://lettier.github.io/posts/2016-10-25-tf-idf-vsm-vs-bm25-with-vuejs.html
  • https://kkulma.github.io/2017-04-24-determining-optimal-number-of-clusters-in-your-data/
  • https://www.kaggle.com/c/quora-question-pairs/discussion/32069#177710
  • https://www.reddit.com/r/MachineLearning/comments/67gonq/dbatchnormalizationbeforeorafterrelu/?st=j1y4j36m&sh=1d708b41

25.04

  • https://github.com/ChenglongChen/Kaggle_CrowdFlower/tree/master/Code/Feat
  • https://dnc1994.com/2016/05/rank-10-percent-in-first-kaggle-competition-en/
  • https://www.slideshare.net/HJvanVeen/feature-engineering-72376750
  • http://hotgram1.filmiro.com/2017/03/11/109/6118559518814109698.pdf

24.04

  • http://arogozhnikov.github.io/2016/04/28/demonstrations-for-ml-courses.html
  • https://github.com/Babylonpartners/fastText_multilingual
  • https://github.com/pYr0rAGE/KaggleQuoraQuestionSimilarity/blob/master/notebooks/Initial%20Analysis.ipynb
  • http://aylien.com/web-summit-2015-tweets-part1
  • https://github.com/pksohn/tweet-clustering
  • http://hdbscan.readthedocs.io/en/latest/howhdbscanworks.html

21.04

  • https://www.kaggle.com/arthurtok/titanic/introduction-to-ensembling-stacking-in-python/notebook

20.04

  • https://gab41.lab41.org/batch-normalization-what-the-hey-d480039a9e3b
  • https://github.com/nbgallery/nbgallery.github.io
  • https://github.com/yanyang729/656kagglequoraquestionpair
  • https://github.com/lodrice/LabelGAN
  • https://github.com/bathulas/kaggle-quora/blob/master/quora.ipynb
  • https://github.com/ArtistScript/Kaggle-Quora-/blob/master/kaggle/xgb.py
  • https://github.com/Mustufain/Quora--Detecting-Duplicate-Questions/blob/master/Quora_Features.py
  • https://github.com/codeheadshopon/Quora-Question-Pair-Classification/blob/master/SImpleLstmShort

19.04

  • https://tryolabs.com/blog/machine-learning-deep-learning-conferences/?N
  • https://gab41.lab41.org/batch-normalization-what-the-hey-d480039a9e3b
  • https://gab41.lab41.org/jupyter-notebook-sharing-is-caring-5ed4831d7f71
  • http://blog.smola.org/post/4110255196/real-simple-covariate-shift-correction

18.04

  • http://www.abigailsee.com/2017/04/16/taming-rnns-for-better-summarization.html
  • http://blog.datadive.net/selecting-good-features-part-i-univariate-selection/
  • http://blog.smola.org/post/4110255196/real-simple-covariate-shift-correction
  • http://www.mitpressjournals.org/doi/abs/10.1162/089976602753284446#.WPVs_VOGPdQ
  • http://scikit-learn-general.narkive.com/ShZKenFK/real-simple-covariate-shift-correction-using-logistic-regression
  • http://wan.poly.edu/KDD2012/docs/p168.pdf
  • http://www.ml.uni-saarland.de/Publications/Hein%20-%20Binary%20Classification%20under%20Sample%20Selection%20Bias(2008).pdf
  • http://www.gatsby.ucl.ac.uk/~gretton/papers/covariateShiftChapter.pdf
  • SVD http://econometricsense.blogspot.com/2011/11/singular-value-decomposition-and-text.html
  • Pearson vs Kendall http://www.statisticssolutions.com/correlation-pearson-kendall-spearman/

17.04

  • df rolling http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.rolling.html
  • https://github.com/DingKe/qrnn
  • https://spacy.io/docs/usage/training-ner
  • https://www.tensorflow.org/versions/master/apidocs/python/tf/contrib/crf/viterbidecode

16.04

  • http://llcao.net/cu-deeplearning17/project/midterm_summarize.pdf
  • https://gist.github.com/stared/dfb4dfaf6d9a8501cd1cc8b8cb806d2e
  • http://www.orbifold.net/default/2016/11/25/some-feedforward-neural-networks-using-keras/

15.04

  • http://blog.nikhilgarg.me/2016/05/a-million-different-lives.html
  • http://www.aclweb.org/anthology/W16-16

14.04

  • http://blog.nikhilgarg.me/
  • https://www.slideshare.net/NikhilGarg51?utmcampaign=profiletracking&utmmedium=sssite&utm_source=ssslideview
  • https://qconsf.com/sf2016/system/files/presentation-slides/scalingqualityusingmachinelearning-qconsf2016.pdf
  • pydatalondon, May: https://pydata.org/london2017/schedule/
  • https://github.com/airalcorn2/RankNet

13.04

  • hardmaru: https://arxiv.org/pdf/1704.03477.pdf
  • https://github.com/maartenbreddels/ipyvolume
  • https://pydata.org/amsterdam2017/schedule/presentation/11/
  • https://github.com/godatadriven/risk-analysis
  • https://github.com/godatadriven/pydata-2017-dsp-tutorial

12.04

  • modern nlp: http://nbviewer.jupyter.org/github/skipgram/modern-nlp-in-python/blob/master/executable/ModernNLPin_Python.ipynb
  • maxout: http://www-etud.iro.umontreal.ca/~goodfeli/maxout.html
  • https://jamesmccaffrey.wordpress.com/2013/11/05/why-you-should-use-cross-entropy-error-instead-of-classification-error-or-mean-squared-error-for-neural-network-classifier-training/
  • https://github.com/dmesquita/understandingtensorflownn
  • https://pub.uni-bielefeld.de/data
  • gated non consecutive cnn: https://arxiv.org/pdf/1512.05726.pdf
  • tf for baby: https://medium.freecodecamp.com/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274
  • acl 16 workshop: http://www.aclweb.org/anthology/W16-16

10.04

  • http://nbviewer.jupyter.org/github/skipgram/modern-nlp-in-python/blob/master/executable/ModernNLPin_Python.ipynb
  • https://github.com/rykov8/ssdkeras/blob/master/SSDtraining.ipynb
  • https://vkolachalama.blogspot.in/2016/05/keras-implementation-of-mlp-neural.html
  • best practice: https://arxiv.org/pdf/1704.01568.pdf
  • https://www.slideshare.net/khomenko1/from-data-science-to-production-deploy-scale-enjoy-pydata-amsterdam-mar-12-2016
  • https://github.com/gianlucahmd/loadsclustering/blob/master/loadsclustering.ipynb

08.04

  • ffm: http://www.csie.ntu.edu.tw/~r01922136/slides/ffm.pdf
  • https://medium.com/startup-grind/i-reverse-engineered-a-500m-artificial-intelligence-company-in-one-week-heres-the-full-story-d067cef99e1c
  • http://www.learnbymarketing.com/950/winning-a-kaggle-competition-analysis/
  • imbalance: https://silicon-valley-data-science.github.io/learning-from-imbalanced-classes/Gaussians.html
  • https://www.svds.com/learning-imbalanced-classes/
  • 3 idiots, ad prediction criteo: http://www.csie.ntu.edu.tw/~r01922136/kaggle-2014-criteo.pdf
  • https://docs.google.com/presentation/d/1bte84MNQu3LDq5WjNMP3ZBDsMfn0eKlnwBvvKFBWVFI/edit#slide=id.g20276450fa128

07.04

  • https://gist.github.com/udibr
  • tf sequence tagging: https://guillaumegenthial.github.io/sequence-tagging-with-tensorflow.html
  • tweet2vec cluster: https://github.com/vendi12/tweet2vec_clustering
  • learning to generate reviews and discovering sentiment: https://github.com/openai/generating-reviews-discovering-sentiment
  • https://aclweb.org/anthology/K15-1013
  • https://github.com/brmson/dataset-sts
  • https://drive.google.com/drive/folders/0B-btHzfJjPnobXZ0MndjSkxkRkk

06.04

  • http://pasky.or.cz/cp/poster-repl4nlp2016.pdf
  • https://www.quora.com/How-do-I-learn-deep-learning-in-2-months
  • non-linear transformation: https://swarbrickjones.wordpress.com/2017/03/28/cross-entropy-and-training-test-class-imbalance/#more-2486
  • homedepot: https://github.com/ChenglongChen/Kaggle_HomeDepot

05.04

  • https://github.com/kootenpv/tweetokenize
  • http://labs.septeni-technology.jp/
  • pointer LSTM: https://github.com/keon/pointer-networks
  • https://rare-technologies.com/text-summarization-in-python-extractive-vs-abstractive-techniques-revisited/
  • https://github.com/mattilyra/glove2h5

04.04

  • http://slides.com/smerity/quora-frontiers-of-memory-and-attention#/35
  • https://github.com/cesc-park/CRCN/blob/master/keras/examples/kaggleottonn.py
  • https://www.visme.co/make-information-beautiful/dona-wong-visualizing-financial-data/
  • http://web.stanford.edu/class/cs224n/reports.html
  • http://www.chioka.in/differences-between-l1-and-l2-as-loss-function-and-regularization/

03.04

  • https://nlp.stanford.edu/~socherr/pa4_ner.pdf
  • https://github.com/chokkan/crfsuite/blob/master/example/ner.py
  • https://www.reddit.com/r/MachineLearning/comments/3dz3fl/dlarchitecturesforentityrecognitionandother/

01.04

  • https://github.com/tuanavu/coursera-university-of-washington/tree/master/machinelearning/3classification

31.03

  • https://www.edge.org/response-detail/23691
  • https://arxiv.org/pdf/1408.0782.pdf

30.03

  • deepnl: https://github.com/attardi/deepnl
  • https://gist.github.com/jeremystan/c236000a4159f9d47c28784fa6693c45#file-initial_architecture-py
  • Relationship Modeling network: https://pbs.twimg.com/media/C7dvymYVQAAut9_.jpg:large
  • https://tech.instacart.com/deep-learning-with-emojis-not-math-660ba1ad6cdc
  • Rethink RNN: https://docs.google.com/document/d/1X9f-wst8QhrCCFTWiJIz6vq1qAOlpyYAUo_kaFf0J8M/edit
  • crfasrnn: https://github.com/torrvision/crfasrnn

29.03

  • silicon valley ds: https://github.com/silicon-valley-data-science/RNN-Tutorial
  • https://github.com/richliao/textClassifier
  • https://richliao.github.io/supervised/classification/2016/12/26/textclassifier-RNN/

28.03

  • https://www.dropbox.com/s/tohrsllcfy7rch4/SimpleQuestions_v2.tgz
  • https://github.com/sujitpal/dl-models-for-qa
  • http://allenai.org/data.html
  • https://www.nervanasys.com/building-skip-thought-vectors-document-understanding/

27.03

  • https://truyentran.github.io/talks/ai16-tute-part-I.pdf
  • https://github.com/truyentran
  • RE with LSTM in TF: https://github.com/thunlp/TensorFlow-NRE
  • http://www.exegetic.biz/blog/2015/12/making-sense-logarithmic-loss/
  • http://nghiaho.com
  • https://liusida.github.io/2016/10/31/translate-from-tf-2-keras/

26.03

  • https://github.com/pandas-dev/pandas/blob/master/doc/cheatsheet/PandasCheatSheet.pdf
  • https://github.com/stanfordnlp/cs224n-winter17-notes/

25.03

  • https://github.com/seatgeek/fuzzywuzzy
  • 10 misunderstandings of the p-value in science: http://tuanvannguyen.blogspot.com/2017/03/10-hieu-lam-ve-tri-so-p-trong-khoa-hoc.html

23.03

  • http://cs224d.stanford.edu/reports_2016.html
  • https://github.com/hycis/bidirectional_RNN
  • https://github.com/MLWave/Kaggle-Ensemble-Guide
  • https://github.com/stanfordnlp/cs224n-winter17-notes
  • https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CIKM14tutorialHeGaoDeng.pdf

21.03

  • DSSM: https://www.microsoft.com/en-us/research/project/dssm/?from=http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fprojects%2Fdssm%2F
  • MS NLP https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/CIKM14tutorialHeGaoDeng.pdf

20.03

  • https://github.com/kweonwooj/kagglesantanderproduct_recommendation
  • bn in application: https://github.com/yskmt/kaggle-otto/tree/master/keras
  • https://github.com/WenchenLi/kaggle/blob/master/otto/keras/kaggleottonn.py
  • https://github.com/ducha-aiki/caffenet-benchmark/blob/master/batchnorm.md

I haven't gone back to check what they are suggesting in their original paper, but I can guarantee that recent code written by Christian applies relu before BN. It is still occasionally a topic of debate, though.
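The two orderings under debate can be sketched in a few lines. This is a minimal pure-Python illustration, not code from the linked benchmark or from any library: `batch_norm` here is a simplified stand-in (batch statistics only, no learned scale/shift, no running averages), and the helper names `bn_then_relu` and `relu_then_bn` are made up for this note.

```python
import math


def relu(xs):
    """Element-wise ReLU over a list of floats."""
    return [max(0.0, x) for x in xs]


def batch_norm(xs, eps=1e-5):
    """Normalize one feature across the batch to zero mean, unit variance.

    Simplified on purpose: no learned scale/shift, no running statistics.
    """
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / math.sqrt(var + eps) for x in xs]


def bn_then_relu(xs):
    """Ordering 1: normalize the pre-activations, then apply ReLU."""
    return relu(batch_norm(xs))


def relu_then_bn(xs):
    """Ordering 2: apply ReLU first, then normalize the activations."""
    return batch_norm(relu(xs))


# One feature observed over a batch of five examples.
batch = [-2.0, -1.0, 0.0, 1.0, 2.0]
out_a = bn_then_relu(batch)
out_b = relu_then_bn(batch)
```

With BN before ReLU the layer output is non-negative and no longer zero-mean; with ReLU before BN the next layer receives zero-mean inputs that include negative values. Which one trains better is an empirical question, which is what the benchmark link above compares.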

17.03

  • install keras on gpu: please use the --no-deps flag: https://github.com/fchollet/keras/wiki/Keras-2.0-release-notes
  • quora again: https://github.com/abhishekkrthakur/isthataduplicatequora_question
  • clickbait: https://github.com/abhishekkrthakur/clickbaits_revisited

16.03

  • http://www.cs.cornell.edu/courses/cs474/2005fa/Handouts/advanced-qa.pdf
  • https://github.com/fchollet/keras/wiki/Keras-2.0-release-notes
  • https://www.slideshare.net/anirudhkoul/squeezing-deep-learning-into-mobile-phones
  • https://automatedinsights.com/blog/the-python-nlp-ccosystem-a-short-and-very-opinionated-guide
  • https://metamind.io/research/learning-when-to-skim-and-when-to-read

15.03

  • https://github.com/rguthrie3/DeepLearningForNLPInPytorch/blob/master/Deep%20Learning%20for%20Natural%20Language%20Processing%20with%20Pytorch.ipynb
  • http://pytorch.org/#pip-install-pytorch
  • tweet calendar: http://ec2-54-170-89-29.eu-west-1.compute.amazonaws.com:8000//month/201703/
  • https://www.cs.cornell.edu/courses/cs6740/2010sp/
  • hello keras 2: I love it, https://blog.keras.io/
  • how to annotate: https://docs.google.com/document/d/1caUD8h-M117pKlds8rRP8jzQ0GN41NzD9UYvog4NyuQ/edit#heading=h.ggo1tu2159da
  • social health mining: http://www.cs.jhu.edu/~mdredze/code.php
  • http://www.sciencedirect.com/science/article/pii/S088523081630002X

14.03

  • seq2seq on tf(general) https://github.com/google/seq2seq
  • sentencepiece tokenizer https://github.com/google/sentencepiece

13.03

  • visual search in es: https://github.com/tuan3w/visual_search
  • 9-15% of active twitter users are bots: https://arxiv.org/pdf/1703.03107.pdf
  • http://www.springer.com/gp/book/9783319472409
  • https://arxiv.org/pdf/1602.04427.pdf
  • Socher at LxMLS: http://lxmls.it.pt/2014/socher-lxmls.pdf
  • use vgg to classify cat/dog: https://gist.github.com/embanner/6149bba89c174af3bfd69537b72bca74
  • https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html

10.03

  • https://github.com/meereeum/lda2vec-tf
  • https://github.com/DiceTechJobs/ConceptualSearch

09.03

  • https://github.com/fastai/courses
  • https://www.slideshare.net/0xdata/arno-candel-aibythebay-030617
  • https://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=98336
  • https://medium.com/salesforce-engineering/salesforce-research-deep-learning-breakthroughs-d83c8b2ac4c3#.a9zswyhov

08.03

  • CMU RL and control course: https://katefvision.github.io/
  • https://www.slideshare.net/JasonKessler/turning-unstructured-content-into-kernels-of-ideas/52
  • norvig ngram: http://norvig.com/ngrams/

07.03

  • https://www.slideshare.net/JasonKessler/turning-unstructured-content-into-kernels-of-ideas/52
  • https://arxiv.org/pdf/1703.00565.pdf
  • https://jasonkessler.github.io/st-sim.html
  • Dr Bao H.T JAIST: http://www.jaist.ac.jp/~bao/VIASM-SML/Lecture/L1-ML%20overview.pdf
  • Khanh UMD: https://github.com/khanhptnk?tab=repositories

06.03

  • http://campuspress.yale.edu/yw355/deep_learning/
  • https://github.com/georgeiswang/Keras_Example
  • https://github.com/thomasj02/DeepLearningProjectWorkflow
  • https://tensorflow.github.io/serving/docker.html
  • Deep learning in NLP: http://campuspress.yale.edu/yw355/deep_learning/

05.03

  • fchollet: xception https://arxiv.org/pdf/1610.02357.pdf

04.03

  • https://github.com/jfsantos/TensorFlow-Book
  • https://github.com/jfsantos/keras-tutorial/blob/master/notebooks/5%20-%20Improving%20generalization%20with%20regularizers%20and%20constraints.ipynb

02.03

  • https://explosion.ai/blog/supervised-similarity-siamese-cnn
  • https://github.com/TeamHG-Memex/eli5/blob/master/README.rst
  • https://github.com/cemoody/topicsne?files=1
  • http://smerity.com/articles/2017/deepcoderandai_hype.html

01.03

  • http://smerity.com/articles/2017/deepcoderandai_hype.html
  • Twitter NER annotation: https://docs.google.com/document/d/12hI-2A3vATMWRdsKkzDPHu5oT74_tG0-PPQ7VN0IRaw/edit
  • WNUT 19, Japan, result: https://noisy-text.github.io/2016/pdf/WNUT19.pdf
  • pytorch vs keras/tf: https://www.reddit.com/r/MachineLearning/comments/5w3q74/dsopytorchvstensorflowwhatstheverdicton/
  • quora duplicate question detection: accuracy 1% higher (84.8) but with 100x the params of my model: https://github.com/abhishekkrthakur/isthataduplicatequora_question/blob/master/deepnet.py
  • https://github.com/chiphuyen/tf-stanford-tutorials?files=1
  • pretrained fasttext on wikipedia: https://github.com/facebookresearch/fastText

28.02

  • https://github.com/uclmr/emoji2vec/blob/master/TwitterClassification.ipynb
  • http://blog.outcome.io/pytorch-quick-start-classifying-an-image/
  • https://blog.mariusschulz.com/2014/06/03/why-using-in-regular-expressions-is-almost-never-what-you-actually-want

27.02

  • random walk -> graph -> node2vec: http://www.kdd.org/kdd2016/subtopic/view/node2vec-scalable-feature-learning-for-networks
  • URL2VEC: http://www.newfoundland.nl/wp/?p=112
  • 5 diseases of doing science: http://www.sciencedirect.com/science/article/pii/S104898431730070X
  • recommended book: https://www.amazon.com/Language-Processing-Perl-Prolog-Implementation/
  • Martin Gorner, DL without a PhD: https://github.com/martin-gorner/tensorflow-mnist-tutorial
  • https://codelabs.developers.google.com/codelabs/cloud-tensorflow-mnist/#0
  • https://docs.google.com/presentation/d/18MiZndRCOxB7g-TcCl2EZOElS5udVaCuxnGznLnmOlE/pub?slide=id.p
  • https://docs.google.com/presentation/d/1TVixw6ItiZ8igjp6U17tcgoFrLSaHWQmMOwjlgQY9co/pub?slide=id.p

26.02

  • https://medium.com/zendesk-engineering/how-zendesk-serves-tensorflow-models-in-production-751ee22f0f4b#.diz6kjaus
  • https://github.com/gkamradt/Lessons-Learned-Data-Science-Interviews/blob/master/Lessons%20Learned%20-%20Data%20Science%20Interviews.pdf

25.02

  • gensim 1.0: https://rare-technologies.com/gensim-switches-to-semantic-versioning/
  • https://www.slideshare.net/AhmadQamar3/using-deep-neural-networks-for-fashion-applications

24.02

  • how to init uniform (-b, b), from Marek's deep learning summer school notes: http://www.marekrei.com/blog/26-things-i-learned-in-the-deep-learning-summer-school/
  • Beam preprocessing: https://research.googleblog.com/2017/02/preprocessing-for-machine-learning-with.html
  • https://github.com/offbit/char-models/blob/master/doc-rnn2.py

23.02

  • http://affinelayer.com/pixsrv/
  • https://github.com/affinelayer/pix2pix-tensorflow#datasets-and-trained-models

22.02

  • https://github.com/offbit/char-models
  • https://offbit.github.io/how-to-read/
  • https://hackernoon.com/learning-ai-if-you-suck-at-math-p4-tensors-illustrated-with-cats-27f0002c9b32#.xqpspe69f
  • Beam search, NN tut from Quoc Le: https://cs.stanford.edu/~quocle/tutorial2.pdf
  • marek sequence tagger: https://github.com/marekrei/sequence-labeler

21.02

  • https://github.com/marekrei/sequence-labeler
  • marekrei word + char attention: http://www.marekrei.com/blog/
  • datalab: https://github.com/googledatalab/
  • https://tw.pycon.org/2017/en-us/speaking/cfp/

20.02

  • https://github.com/ZhitingHu/logicnn
  • http://www.cs.cmu.edu/~zhitingh/data/acl16harnessing_slides.pdf
  • Lample: https://arxiv.org/pdf/1603.01360.pdf, https://github.com/glample/tagger
  • stack LSTM NER: https://github.com/clab/stack-lstm-ner
  • https://github.com/napsternxg/DeepSequenceClassification/blob/master/model.py
  • chatbot: https://github.com/Marsan-Ma/tfchatbotseq2seq_antilm
  • keras crf https://github.com/pressrelations/keras/blob/98b2bb152b8d472150a3fc4f91396ce7f767bed9/examples/conll2000bilstm_crf.py
  • Ma Xue, CMU: best paper in ACL 2016, Germany https://github.com/XuezheMax/LasagneNLP
  • rnn+cnn+crf: https://arxiv.org/pdf/1603.01354.pdf
  • https://github.com/pth1993/vnspamsmsfiltering/blob/master/src/smsfiltering.py
  • https://data36.com/wp-content/uploads/2016/08/practicaldatadictionaryfinaldata36tomimesterpublished.pdf

19.02

  • scikit plot: https://github.com/reiinakano/scikit-plot

18.02

  • really cool Francis: https://github.com/frnsys/
  • ai notes: http://frnsys.com/ainotes/ainotes.pdf
  • brilliant wrong, ROC explanation: http://arogozhnikov.github.io/2015/10/05/roc-curve.html
  • yandex MLSchool in London: https://github.com/yandexdataschool/MLatImperial2017/

17.02

  • RNNs bag of applications: http://www.cs.toronto.edu/~urtasun/courses/CSC2541_Winter17/RNN.pdf
  • BiMPM https://arxiv.org/pdf/1702.03814.pdf
  • TextSum step by step: http://www.fastforwardlabs.com/luhn/
  • https://keon.io/rl/deep-q-learning-with-keras-and-gym/
  • https://medium.com/startup-grind/fueling-the-ai-gold-rush-7ae438505bc2#.ny8j80fl3
  • big 5 for DS: https://www.quora.com/How-do-you-judge-a-good-Data-scientist-with-just-5-questions
  • keon: https://github.com/keon/awesome-nlp
  • quid: word2vec + wikipedia: https://quid.com/feed/how-quid-improved-its-search-with-word2vec-and-wikipedia?utmcontent=42445351&utmmedium=social&utm_source=twitter
  • https://gist.github.com/asmeurer/5843625

16.02

  • market2vec: https://github.com/talolard/MarketVectors/blob/master/preparedata.ipynb
  • anything2vec: https://gist.github.com/nzw0301/333afc00bd508501268fa7bf40cafe4e
  • https://github.com/bradleypallen/keras-movielens-cf
  • https://www.slideshare.net/tkoshikawa?utmcampaign=profiletracking&utmmedium=sssite&utmsource=ssslideview
  • https://github.com/lipiji/App-DL
  • http://www.slideshare.net/LimZhiYuanZane/deep-learning-for-stock-prediction
  • https://github.com/kh-kim/stockmarketreinforcement_learning
  • stock2vec: https://github.com/kh-kim/stock2vec
  • deepwalk and word2vec: http://nadbordrozd.github.io/blog/2016/06/13/deepwalking-with-companies/
  • http://m-mitchell.com/NAACL-2016/SemEval/SemEval-2016.pdf
  • gandlf: https://github.com/codekansas/gandlf
  • predictive on stock trading with sentiment: http://www.kdnuggets.com/2016/01/sentiment-analysis-predictive-analytics-trading-mistake.html
  • https://github.com/bradleypallen/keras-emoji-embeddings
  • https://github.com/bradleypallen/keras-quora-question-pairs/blob/master/README.md
  • DESM: https://www.microsoft.com/en-us/research/project/dual-embedding-space-model-desm/

15.02

  • sentiment analysis on Super Bowl: http://blog.aylien.com/sentiment-analysis-of-2-2-million-tweets-from-super-bowl-51/
  • spacy advanced text analysis: https://github.com/JonathanReeve/advanced-text-analysis-workshop-2017/blob/master/advanced-text-analysis.ipynb
  • pytorch: https://github.com/vinhkhuc/PyTorch-Mini-Tutorials
  • Quora engineering: https://engineering.quora.com/Semantic-Question-Matching-with-Deep-Learning
  • spaCy bag of nns: https://explosion.ai/blog/quora-deep-text-pair-classification
  • AUC 0.875 http://analyzecore.com/2017/02/08/twitter-sentiment-analysis-doc2vec/

14.02

  • event detection, extraction, triggering, mention: https://github.com/anoperson/jointEE-NN
  • batch renorm, due to sensitivity to batch size, initialization: https://arxiv.org/pdf/1702.03275.pdf
  • https://github.com/bmitra-msft/Demos/blob/master/notebooks/DESM.ipynb
  • nn for document ranking, mitra, ms cntk: https://github.com/bmitra-msft/NDRM
  • TFDevSummit: https://events.withgoogle.com/tensorflow-dev-summit/watch-the-videos/#content

13.02

  • Quora siamese: https://github.com/erogol/QuoraDQBaseline

12.02

  • http://www.slideshare.net/BhaskarMitra3/neural-text-embeddings-for-information-retrieval-wsdm-2017

10.02

  • kerlym: https://github.com/osh/kerlym
  • ICLR 17: https://amundtveit.com/2016/11/12/deep-learning-for-natural-language-processing-iclr-2017-discoveries/
  • https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb
  • all-but-the-top, pca on word2vec: https://arxiv.org/pdf/1702.01417.pdf
  • https://github.com/peter3125/sentence2vec

08.02

  • polarised term for document anonymisation: https://ddu1.github.io/Anonymization/
  • oxford course: https://github.com/oxford-cs-deepnlp-2017/lectures
  • tf fold: dynamic batching: https://research.googleblog.com/2017/02/announcing-tensorflow-fold-deep.html
  • https://www.insight-centre.org/sites/default/files/publications/newhorizons_online.pdf
  • https://github.com/chsasank/Traffic-Sign-Classification.keras/blob/master/Traffic%20Sign%20Classification.ipynb

07.02

  • openrefine: http://alexpetralia.com/posts/2015/12/14/the-problem-with-openrefine-clean-vs-messy-data
  • https://www.linkedin.com/pulse/keras-neural-networks-win-nvidia-titan-x-abhishek-thakur
  • deep q learning with keras and gym: https://keon.io/rl/deep-q-learning-with-keras-and-gym/
  • structured attention, Yoon Kim and Hoang Luong: https://github.com/harvardnlp/struct-attn
  • understanding DL requires rethinking generalisation: https://openreview.net/pdf?id=Sy8gdB9xx
  • GAN: https://github.com/osh/KerasGAN

06.02

  • http://lxmls.it.pt/2016/LxMLS2016.pdf
  • http://www.cs.umb.edu/~twang/file/tricksfromdl.pdf
  • https://svn.spraakdata.gu.se/repos/richard/pub/ml2016web/LT23062016examplesolution.pdf
  • https://svn.spraakdata.gu.se/repos/richard/pub/ml2015_web/l7.pdf
  • https://chsasank.github.io/spoken-language-understanding.html
  • ML4NLP: http://stp.lingfil.uu.se/~shaooyan/ml/nn.part2.pdf
  • Topic Modeling for extracting key words: http://bugra.github.io/work/notes/2017-02-05/topic-modeling-for-keyword-extraction/
  • Google Scraper: https://github.com/NikolaiT/GoogleScraper
  • Richard Johansson: https://svn.spraakdata.gu.se/repos/richard/pub/ml2015_web/l7.pdf
  • https://code.facebook.com/posts/457605107772545/under-the-hood-building-accessibility-tools-for-the-visually-impaired-on-facebook/
  • l2svm outperforms softmax: https://arxiv.org/pdf/1306.0239v4.pdf
  • xent vs hinge loss: http://cs231n.github.io/linear-classify/
  • https://github.com/nzw0301/keras-examples/blob/master/Skip-gram-with-NS.ipynb
  • model zoo pytorch: https://github.com/Cadene/tensorflow-model-zoo.torch
  • quora question pair: http://www.forbes.com/sites/quora/2017/01/30/data-at-quora-first-quora-dataset-release-question-pairs/#3d052ef475cb
  • Psychometric, CA and Trump: https://motherboard.vice.com/en_us/article/how-our-likes-helped-trump-win

27.1

  • https://github.com/bbelderbos/Codesnippets/tree/master/python
  • https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.htm

26.1

  • https://jaan.io/food2vec-augmented-cooking-machine-intelligence/
  • http://multithreaded.stitchfix.com/blog/2017/01/23/scaling-ds-at-sf-slides-from-ddtexas/
  • https://docs.docker.com/docker-for-mac/
  • https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#1
  • https://petewarden.com/

25.1

  • question duplication of Quora: https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs
  • stats for hackers code: https://github.com/croach/blog/tree/master/content
  • http://multithreaded.stitchfix.com/blog/2017/01/23/scaling-ds-at-sf-slides-from-ddtexas/

24.1

  • wordrank: http://deliprao.com/archives/124
  • code: https://bitbucket.org/shihaoji/wordrank
  • https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/WordRankwrapperquickstart.ipynb
  • https://github.com/parulsethi/gensim/blob/wordrankwrapper/docs/notebooks/Wordrankcomparisons.ipynb
  • https://rare-technologies.com/wordrank-embedding-crowned-is-most-similar-to-king-not-word2vecs-canute/

23.1

  • nlp terms for novice: http://www.datasciencecentral.com/profiles/blogs/10-common-nlp-terms-explained-for-the-text-analysis-novice?utmcontent=buffer172af&utmmedium=social&utmsource=twitter.com&utmcampaign=buffer
  • blockchain: https://opendatascience.com/blog/what-is-the-blockchain-and-why-is-it-so-important/
  • nbgrader: https://github.com/jupyter/nbgrader
  • Adversarial ML: https://mascherari.press/introduction-to-adversarial-machine-learning/
  • 4 questions for G. Hinton: https://gigaom.com/2017/01/16/four-questions-for-geoff-hinton/
  • Debug in TF: https://wookayin.github.io/TensorflowKR-2016-talk-debugging/#1

20.1

  • demystify DS: https://docs.google.com/presentation/d/1N3KhPA--cQNjF9mD4Z4IzjKKFdwq1Ff6wQ6NN102uIk/edit#slide=id.g1be386a8a6021
  • ML on mobile: http://alexsosn.github.io/ml/2015/11/05/iOS-ML.html
  • https://www.bignerdranch.com/blog/use-tensorflow-and-bnns-to-add-machine-learning-to-your-mac-or-ios-app/
  • https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ios_examples
  • https://github.com/dennybritz/sentiment-analysis

19.1

  • Facebook again, pytorch: http://pytorch.org/
  • https://rare-technologies.com/new-gensim-feature-author-topic-modeling-lda-with-metadata/
  • pointer network: https://github.com/devsisters/pointer-network-tensorflow

18.1

  • http://blog.dennybritz.com/2017/01/17/engineering-is-the-bottleneck-in-deep-learning-research/
  • ml for practitioner: http://martin.zinkevich.org/rulesofml/rulesofml.pdf
  • write dl/nn from scratch: https://github.com/dmlc/minpy

17.1

  • improve headlines with salient words and seo score: http://www-personal.umich.edu/~tdszyman/misc/nlpmj16.pdf
  • text summarisation: http://www-personal.umich.edu/~tdszyman/misc/summarization15.pdf
  • word embedding over time: http://www-personal.umich.edu/~tdszyman/misc/InsightSIGNLP16.pdf
  • victor, DS at Polytechnique in France: https://github.com/Vict0rSch/datasciencepolytechnique
  • Thien NYU: http://www.cs.nyu.edu/~thien/
  • tonymooori: https://github.com/TonyMooori/studying
  • learning theory: https://web.stanford.edu/class/cs229t/notes.pdf
  • time series predictions: http://danielhnyk.cz/predicting-sequences-vectors-keras-using-rnn-lstm/

16.1

  • Edward Dustin Tran in TF already, so cool: https://arxiv.org/pdf/1701.03757v1.pdf
  • keras is in tensorflow from now on, announced by @fchollet on Twitter.
  • squeezenet = tiny alexnet (5MB) https://github.com/rcmalli/keras-squeezenet
  • won $5k: https://medium.freecodecamp.com/recognizing-traffic-lights-with-deep-learning-23dae23287cc#.9yb31nsm4
  • https://github.com/karoldvl/paper-2015-esc-convnet/blob/master/Code/Results.ipynb

15.1

  • deep spell code: https://github.com/MajorTal/DeepSpell
  • draw svg in jupyter: https://github.com/uclmr/egal
  • sound classification with cnn: https://github.com/karoldvl/paper-2015-esc-convnet

14.1

  • https://medium.com/@majortal/deep-spelling-9ffef96a24f6
  • line bot + rnn + tf, vanhuyz: https://github.com/vanhuyz/line-sticker-bot
  • https://github.com/Vict0rSch/deep_learning/tree/master/keras
  • https://github.com/openai/pixel-cnn
  • AWS Lambda: http://blog.matthewdfuller.com/p/aws-lambda-pricing-calculator.html
  • deep text corrector: http://atpaino.com/2017/01/03/deep-text-correcter.html
  • https://github.com/dhwajraj/deep-text-classifier-mtl

13.1

  • convlstm: https://github.com/carlthome/tensorflow-convlstm-cell
  • GAN and RNN: https://www.reddit.com/r/MachineLearning/comments/40ldq6/generativeadversarialnetworksfortext/
  • generate sentences from continuous space: https://arxiv.org/pdf/1511.06349v2.pdf
  • How to train your Gen. model: Sampling, likelihood or adversary

12.1

  • https://www.raywenderlich.com/126063/react-native-tutorial
  • ml practitioners: https://news.ycombinator.com/item?id=10954508
  • spotify word2vec: https://douweosinga.com/projects/marconi?song1id=45yEy5WJywhJ3sDI28ajTm&song2id=
  • https://github.com/DOsinga/marconi/blob/master/train_model.py
  • True| Good | Kind | Useful | Relevant | Necessary https://www.quora.com/What-is-Triple-Filter-test-of-Socrates
  • https://www.youtube.com/watch?v=ifYfJdo27_k
  • student note: https://adeshpande3.github.io/adeshpande3.github.io/Deep-Learning-Research-Review-Week-3-Natural-Language-Processing

11.1

  • ggplot2 in R: http://sharpsightlabs.com/blog/mapping-vc-investment/
  • TF 1.0, mature. https://opendatascience.com/blog/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
  • NN semantic encoder: https://github.com/pdasigi/neural-semantic-encoders/blob/master/nse.py
  • DL in NN, overview: https://arxiv.org/pdf/1404.7828v4.pdf
  • Jürgen Schmidhuber: http://people.idsia.ch/~juergen/

10.1

  • GDG NL: http://www.slideshare.net/RokeshJankie/introducing-tensorflow-the-game-changer-in-building-intelligent-applications
  • https://github.com/ToferC/Twittergraphingpython
  • http://www.oujago.com/DL_more.html
  • thiago DS at Yahoo: https://tgmstat.wordpress.com/
  • deepstack playing poker: https://arxiv.org/pdf/1701.01724v1.pdf
  • silly DL: https://news.ycombinator.com/item?id=13353941
  • http://p.migdal.pl/2017/01/06/king-man-woman-queen-why.html
  • AE for new molecule: http://www.impactjournals.com/oncotarget/index.php?journal=oncotarget&page=article&op=view&path[]=14073&pubmed-linkout=1

9.1

  • xlingual embedding: https://levyomer.wordpress.com/2017/01/08/a-strong-baseline-for-learning-cross-lingual-word-embeddings-from-sentence-alignments/
  • greg notebooks: https://github.com/gjreda/gregreda.com/tree/master/content/notebooks
  • the periodic table of AI: http://ai.xprize.org/news/periodic-table-of-ai
  • the same table of DL: http://www.deeplearningpatterns.com/doku.php/overview
  • aylien text mining and analysis: Sebastian Ruder: https://arxiv.org/pdf/1609.02746v1.pdf
  • DS as a freelancer from Greg Yhat: http://www.gregreda.com/2017/01/07/freelance-data-science-experience/

7.1

  • how bayesian inference works: http://brohrer.github.io/howbayesianinference_works.html
  • best vis projects in 2016: http://flowingdata.com/2016/12/29/best-data-visualization-projects-of-2016/
  • https://flowingdata.com/2012/12/17/getting-started-with-charts-in-r/

5.1

  • allenai biattflow: https://github.com/allenai/bi-att-flow
  • fork guy: https://github.com/BinbinBian
  • ICLR 17, DCNN: https://arxiv.org/pdf/1611.01604v2.pdf
  • victor zhong: https://github.com/vzhong/posts-notebooks
  • BN, if you want gaussian, zero mean: https://kratzert.github.io/2016/02/12/understanding-the-gradient-flow-through-the-batch-normalization-layer.html
  • statsnlp https://github.com/uclmr/stat-nlp-book
  • sota of qa: http://metamind.io/research/state-of-the-art-deep-learning-model-for-question-answering/

4.1

  • dynet: CMU neural networks in C++: https://github.com/clab
  • systran: https://arxiv.org/pdf/1610.05540v1.pdf
  • punctuation normalisation: http://www.statmt.org/wmt11/normalize-punctuation.perl
  • GAN in keras: https://github.com/osh/KerasGAN
  • reinforcement learning in keras and gym: https://github.com/osh/kerlym
  • ML 101 for DE: https://drive.google.com/drive/folders/0B3bb7xB2VOUBMW1LQjVYUlJNRFU

3.1

  • variational for text processing: https://github.com/carpedm20/variational-text-tensorflow
  • spotify CNN music classification: https://www.dropbox.com/s/22bqmco45179t7z/thesis-FINAL.pdf
  • kaggle winning solution for whale detection: https://github.com/benanne
  • https://github.com/zygmuntz?tab=repositories

2.1.17

  • overfitting in life: http://tuanvannguyen.blogspot.com/2016/12/over-fitting-va-y-nghia-thuc-te-trong.html
  • optimal stopping problem: https://plus.maths.org/content/solution-optimal-stopping-problem

31.12

  • visualisation NLP: http://www.aclweb.org/anthology/N16-1082

30.12

  • zero shot translation: https://techcrunch.com/2016/11/22/googles-ai-translation-tool-seems-to-have-invented-its-own-secret-internal-language/

29.12

  • Music Tagging, CRNN https://arxiv.org/pdf/1609.04243v3.pdf
  • Benmusic: http://www.bensound.com/
  • event detection: http://anthology.aclweb.org/C/C14/C14-1134.pdf

28.12

  • NIPS 2016, embedding projector: https://arxiv.org/pdf/1611.05469.pdf
  • stats learning: https://web.stanford.edu/class/cs229t/notes.pdf
  • http://www.normansoft.com/blog/index.html
  • Tf projector is really cool: https://github.com/normanheckscher/mnist-tensorboard-embeddings/blob/master/mnist_t-sne.py
  • Who to follow on Twitter in ML/DL: https://twitter.com/DLMLLoop/lists/deep-learning-loop/members
  • How to learn? BPTT https://medium.com/@karpathy/yes-you-should-understand-backprop-e2f06eab496b#.sunmvqmsx

27.12

  • deep learning with Torch: https://github.com/soumith/cvpr2015
  • T7: https://github.com/soumith/cvpr2015/blob/master/cvpr-torch.pdf
  • GPOD general purpose object detector: https://github.com/EvgenyNekrasov/gpod
  • mckinseys: http://www.forbes.com/sites/louiscolumbus/2016/12/18/mckinseys-2016-analytics-study-defines-the-future-machine-learning
  • gumbel add noise to sigmoid: https://github.com/yandexdataschool/gumbel_lstm
  • fastai wordembedding: https://github.com/fastai/courses/blob/master/deeplearning1/nbs/wordvectors.ipynb

26.12

  • spotify cnn: http://benanne.github.io/2014/08/05/spotify-cnns.html
  • Gated RNN https://arxiv.org/pdf/1612.08083v1.pdf
  • http://www.slideshare.net/SebastianRuder/nips-2016-highlights-sebastian-ruder
  • monolingual dataset WMT 2014: http://www.statmt.org/wmt14/translation-task.html
  • neural turing machine: https://github.com/shawntan/neural-turing-machines
  • yandex ml school HSE: https://github.com/yandexdataschool/HSE_deeplearning

24.12

  • Laurent Dinh: Density estimation https://docs.google.com/presentation/d/152NyIZYDRlYuml5DbBONchJYA7AAwlti5gTWW1eXlLM/
  • Swiftkey, LM: https://blog.swiftkey.com/swiftkey-debuts-worlds-first-smartphone-keyboard-powered-by-neural-networks/
  • porting Theano to TF: https://medium.com/@sentimentron/faceoff-theano-vs-tensorflow-e25648c31800
  • tractica: DL for retailer: https://www.tractica.com/automation-robotics/leveraging-deep-learning-to-improve-the-retail-experience/
  • Effect size: are Singaporean students better at math than Vietnamese students? If ES = 0.3, the two score distributions overlap by nearly 90%, so this PISA ranking says very little.
  • dracula: twitter POS utilised GATE: https://github.com/Sentimentron/Dracula/
  • Business process with LSTM: https://arxiv.org/pdf/1612.02130v1.pdf
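
A quick check of the effect-size note above, as a minimal sketch. It assumes "ES" means Cohen's d, that both score distributions are normal with equal variance, and that "overlap" means the overlapping coefficient (shared area under the two curves). The function names are made up for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def overlap_coefficient(d: float) -> float:
    """Shared area of two unit-variance normal curves whose means differ by d.

    Assumption: d is Cohen's d and both groups have the same variance.
    """
    return 2.0 * normal_cdf(-abs(d) / 2.0)


# Print the overlap for a few effect sizes, including the 0.3 from the note.
for d in (0.1, 0.3, 0.5, 0.8):
    print(f"d = {d:.1f}: overlap = {overlap_coefficient(d):.1%}")
```

Under these assumptions d = 0.3 gives an overlap of about 88%, which is in line with the "near 90%" in the note.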

23.12

  • https://bigdatauniversity.com/courses/deep-learning-tensorflow/

22.12

  • https://quid.com/feed/how-quid-uses-deep-learning-with-small-data
  • dl for coders: http://course.fast.ai/, notebooks here: https://github.com/fastai/courses
  • encoder-decoder RNN: http://www.slideshare.net/ssuser77b8c6/reducing-the-dimensionality-of-data-with-neural-networks
  • https://trello.com/b/rbpEfMld/data-science
  • http://tuanvannguyen.blogspot.com/2016/12/yeu-to-nao-anh-huong-en-iem-pisa-2015.html

21.12

  • https://github.com/napsternxg/TwitterNER
  • news archive: https://news.google.com/newspapers?hl=en#F
  • https://github.com/skillachie/binaryNLP
  • https://github.com/skillachie/nlpArea51/blob/master/FinancialNewsText_Classification.ipynb
  • http://www.kdnuggets.com/2016/12/machine-learning-artificial-intelligence-main-developments-2016-key-trends-2017.html

20.12

  • http://opennmt.net
  • neural relation extraction https://www.aclweb.org/anthology/P/P16/P16-1200.pdf
  • claim classification: https://github.com/UKPLab/coling2016-claim-classification
  • https://www.ukp.tu-darmstadt.de/fileadmin/userupload/GroupUKP/publikationen/2016/2016COLINGCG.pdf

19.12

  • fasttext.zip https://arxiv.org/abs/1612.03651
  • bi sequence classification: same SNLI, event detection: https://pdfs.semanticscholar.org/6f42/cb23262066b4034aba99bf674783ed6cac8b.pdf
  • large scale contextual LSTM and NLP task: https://arxiv.org/pdf/1602.06291.pdf
  • main advances in ML 2016, Xavier at Quora: https://www.quora.com/What-were-the-main-advances-in-machine-learning-artificial-intelligence-in-2016?

17.12

  • https://github.com/jwkvam/bowtie

16.12

  • tensorflow book with code: https://github.com/BinRoot/TensorFlow-Book
  • trading with ML (Georgia Tech): https://www.udacity.com/course/machine-learning-for-trading--ud501

15.12

  • deepbach: https://github.com/SonyCSL-Paris/DeepBach
  • https://www.technologyreview.com/s/603137/deep-learning-machine-listens-to-bach-then-writes-its-own-music-in-the-same-style/
  • http://www.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html?_r=0
  • http://www.asimovinstitute.org/analyzing-deep-learning-tools-music/

14.12

  • spacy vs nltk: https://gist.github.com/rschroll/61b20c41e984a963df2870cfc9e628ed
  • psychometrics, precision marketing, privacy no longer: http://www.michalkosinski.com/
  • 300+ ML projects from Stanford: http://cs229.stanford.edu/PosterSessionProgram.pdf
  • NIPS 2016 code: https://www.reddit.com/r/MachineLearning/comments/5hwqeb/projectallcodeimplementationsfornips2016/
  • Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences: https://github.com/dannyneil/public_plstm

13.12

  • NIPS summary: http://beamandrew.github.io/deeplearning/2016/12/12/nips-2016.html
  • how to choose batch size: https://github.com/karpathy/char-rnn, https://svail.github.io/rnn_perf/, http://axon.cs.byu.edu/papers/Wilson.nn03.batch.pdf
  • https://github.com/lmthang/thesis

12.12

  • Relation classification (RC) via data augmentation: https://arxiv.org/abs/1601.03651
  • broader twitter NER: http://www.slideshare.net/leonderczynski/broad-twitter-corpus-a-diverse-named-entity-recognition-resource
  • sequence classification such as NER, POS: https://github.com/napsternxg/DeepSequenceClassification
  • arctic captions: https://github.com/kelvinxu/arctic-captions/blob/master/alpha_visualization.ipynb
  • COLING 2016 from 13 to 16 Dec, Japan: https://github.com/napsternxg/TwitterNER, http://coling2016.anlp.jp/

11.12

  • SRL and RC: https://github.com/jiangfeng1124/emnlp14-semi, http://ir.hit.edu.cn/~jguo/papers/coling2016-mtlsrc.pdf
  • https://blog.insightdatascience.com/nips-2016-day-3-highlights-robots-that-know-cars-that-see-and-more-1ec958896791
  • http://www.newsreader-project.eu/files/2012/12/NWR-D5-2-1.pdf
  • http://nlesc.github.io/UncertaintyVisualization/
  • http://ixa2.si.ehu.es/nrdemo/demo.php

9.12

  • if then learning: https://papers.nips.cc/paper/6284-latent-attention-for-if-then-program-synthesis.pdf
  • reinforcement learning: https://github.com/DanielTakeshi
  • NIPS 2016: https://github.com/mphuget/NIPS2016
  • https://github.com/zelandiya/KiwiPyCon-NLP-tutorial
  • http://www.wrangleconf.com/apac.html
  • http://cs231n.github.io/aws-tutorial/
  • clickbait F1 98, AUC 99, too good to be true: https://arxiv.org/pdf/1612.01340v1.pdf
  • https://arxiv.org/abs/1606.04474
  • https://github.com/deepmind/learning-to-learn

8.12

  • hackermath: https://github.com/amitkaps/hackermath/blob/master/talk.pdf
  • tensorboard: https://www.tensorflow.org/versions/master/howtos/embeddingviz/index.html
  • embedding projector: http://projector.tensorflow.org/
  • dl4nlp at ukplab, Germany: https://github.com/UKPLab/deeplearning4nlp-tutorial/tree/master/2016-11_Seminar
  • Filter bubble vs Info cascading, Eli Pariser: https://www.ted.com/talks/elipariserbewareonlinefilter_bubbles

7.12

  • tidy data in pandas: http://www.jeannicholashould.com/tidy-data-in-python.html
  • graph db: https://blog.grakn.ai/adding-semantics-to-graph-databases-with-mindmapsdb-part-1-82022bbb3b1c
  • https://github.com/mikonapoli
  • reinforcement learning, open ai: http://people.eecs.berkeley.edu/~pabbeel/nips-tutorial-policy-optimization-Schulman-Abbeel.pdf
  • meal description and food tagging: https://pdfs.semanticscholar.org/5f55/c5535e80d3e5ed7f1f0b89531e32725faff5.pdf

6.12

  • rationale cnn [keras] https://github.com/bwallace/rationale-CNN
  • churn analysis, f1 75%, lr, svm hinge: http://www.aaai.org/ocs/index.php/AAAI/AAAI15/paper/view/9849/9527
  • thanapon noraset: https://northanapon.github.io/read/
  • https://github.com/NorThanapon/adaptive_lm
  • train general AI: https://openai.com/blog/universe/
  • NIPS 2016 https://nips.cc/Conferences/2016/Schedule
  • full ds notebook: https://github.com/donnemartin/data-science-ipython-notebooks
  • Quoc Le, tut2: Autoencoder, CNN, RNN: http://ai.stanford.edu/~quocle/tutorial2.pdf
  • Quoc Le, tut1: nonlinear classifier and backprop: http://ai.stanford.edu/~quocle/tutorial1.pdf
  • Quoc Le, ex1: http://ai.stanford.edu/~quocle/exercise1.py
  • https://alexanderdyakonov.wordpress.com/2016/12/04/сундуки-и-монеты/#more-4401

5.12

  • semantic role labelings: https://blog.acolyer.org/2016/07/05/end-to-end-learning-of-semantic-role-labeling-using-recurrent-neural-networks/
  • ml yearning: https://gallery.mailchimp.com/dc3a7ef4d750c0abfc19202a3/files/MachineLearningYearningV0.501.pdf
  • stock embedding: https://medium.com/@TalPerry/deep-learning-the-stock-market-df853d139e02#.9q1d9hnai
  • fast weights: https://github.com/ajarai

2.12

  • https://github.com/cgpotts/cs224u

1.12

  • https://gist.github.com/honnibal
  • siamese lstm: https://github.com/aditya1503/Siamese-LSTM
  • accuracy of lunar chinese calendar to predict baby sex http://onlinelibrary.wiley.com/doi/10.1111/j.1365-3016.2010.01129.x/abstract;
  • customized keras lambda: https://gist.github.com/keunwoochoi

30.11

  • rnn tricks: http://www.slideshare.net/indicods/general-sequence-learning-with-recurrent-neural-networks-for-next-ml
  • data mining in action: Moscow, Russia: https://github.com/vkantor/MIPTDataMiningInAction_2016
  • hypo testing, birthday effect: http://www.slideshare.net/SergeyIvanov105/birthday-effect-67829860
  • LUI: linguistic UI https://medium.com/swlh/a-natural-language-user-interface-is-just-a-user-interface-4a6d898e9721
  • fake news is 80% accuracy better: http://www.mallikarjunan.com/verytas/how-good-are-you-at-recognizing-satire-quiz
  • nampi, spain 2017
  • decode thought vector: http://gabgoh.github.io/ThoughtVectors/
  • unconstrained fmin: https://github.com/benfred/fmin
  • neural programmer: https://github.com/tensorflow/models/tree/master/neural_programmer
  • https://www.tensorflow.org/versions/master/howtos/embeddingviz/index.html#tensorboard-embedding-visualization

29.11

  • https://github.com/nyu-dl/NLPDLLecture_Note
  • NYU DL for NLP https://docs.google.com/document/d/1YS5QRvqMJVs9n3sK5fFjuldY7_vh42C5uUfxUGgL-Gc/edit
  • http://tuanvannguyen.blogspot.com/2016/11/machine-learning-la-gi.html
  • http://sebastianruder.com/cross-lingual-embeddings/
  • https://docs.google.com/presentation/d/1O-Ics69y445aWuxQ_VW6SDvKT9BGl3ZXLLZDG9tUiUY/edit#slide=id.p

28.11

  • event detection and deep learning: http://www.cs.nyu.edu/~thien/
  • https://github.com/anoperson/NeuralNetworksForRE
  • ED EE and MD with RNN and CNN: http://www.aclweb.org/anthology/P/P15/P15-2060.pdf

27.11

  • http://www.slideshare.net/PyData/fang-xu-enriching-content-with-knowledge-base-by-search-keywords-and-wikidata
  • https://www.mediawiki.org/wiki/Wikidataqueryservice/User_Manual

26.11

  • slides from mlconf sf 2016: http://www.slideshare.net/SessionsEvents/anjuli-kannan-software-engineer-google-at-mlconf-sf-2016
  • http://www.slideshare.net/KenjiEsaki/kdd-2016-slide

25.11

  • vo duy tin: https://github.com/duytinvo
  • https://spacy.io/docs/usage/entity-recognition

24.11

  • chinese NLP: https://github.com/taozhijiang/chinese_nlp
  • not news: http://venturebeat.com/2016/11/23/twitter-cortex-team-loses-some-ai-researchers/
  • sentihood: http://annotate-neighborhood.com/download/download.html, https://arxiv.org/pdf/1610.03771v1.pdf

23.11

Multithreading in Theano:

  • check your blas: https://raw.githubusercontent.com/Theano/Theano/master/theano/misc/check_blas.py
  • http://deeplearning.net/software/theano/tutorial/multi_cores.html?highlight=multi%20co
  • https://github.com/Theano/Theano/issues/3239
  • set OMP_NUM_THREADS=4 inside the notebook with %env: https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts/
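
A minimal sketch of the last tip, assuming the variable meant is the standard OpenMP setting `OMP_NUM_THREADS` and that 4 threads suits the machine. In a notebook cell the IPython magic `%env OMP_NUM_THREADS=4` does the same thing. The variable should be set before the numerical library is imported, since OpenMP runtimes typically read it at start-up.

```python
import os

# Assumption: OMP_NUM_THREADS is the variable the note refers to, and 4 is
# only an example value; pick one that matches the number of cores available.
os.environ["OMP_NUM_THREADS"] = "4"

# Import Theano (or another OpenMP-backed library) only after this point,
# so it starts with the setting already in place.
# import theano

print("OMP_NUM_THREADS =", os.environ["OMP_NUM_THREADS"])
```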

Debug

  • torch vs theano vs tf: https://www.quora.com/Is-TensorFlow-better-than-other-leading-libraries-such-as-Torch-Theano
  • debug Deep Learning: https://gab41.lab41.org/some-tips-for-debugging-deep-learning-3f69e56ea134#.1ldbphlav
  • negative loss: https://github.com/fchollet/keras/issues/1917

  • CAP: Clustering Association Prediction, stats thinking https://www.researchgate.net/publication/310597778Scientificdiscoverythroughstatistics

22.11

  • stance detection: favour or against: http://isabelleaugenstein.github.io/papers/SemEval2016-Stance.pdf
  • Hugo from Twitter to Google Brain, Montreal: https://techcrunch.com/2016/11/21/google-opens-new-ai-lab-and-invests-3-4m-in-montreal-based-ai-research/?sr_share=facebook
  • train word2vec in gensim in good way: https://github.com/RaRe-Technologies/gensim/blob/develop/docs/notebooks/doc2vec-IMDB.ipynb

21.11

  • sparql in python: https://joernhees.de/blog/tag/install/
  • minhash: http://mccormickml.com/2015/06/12/minhash-tutorial-with-python-code/
  • beating the kaggle easy way: http://www.ke.tu-darmstadt.de/lehre/arbeiten/studien/2015/Dong_Ying.pdf

19.11

  • 10 takeaways writeup MLConf SF: https://tryolabs.com/blog/2016/11/18/10-main-takeaways-from-mlconf/
  • theano summer school: https://github.com/mila-udem/summerschool2015
  • gpu card for macbook pro: http://udibr.github.io/using-external-gtx-980-with-macbook-pro.html
  • transfer learning using pretrained vgg, resnet for your problem: https://github.com/dolaameng/transfer-learning-lab

18.11

  • wikidata sparql: https://docs.google.com/presentation/d/16HhxRH-kkxqxcyzepXT-dHrnE90yVPlfkPq3cM2UzFg/edit#slide=id.g18e33c9ee62134
  • unkify: https://github.com/cdg720/emnlp2016/blob/master/utils.py#L322
  • http://smerity.com/articles/2016/googlenmtarch.html

17.11

  • wikidata: http://www.slideshare.net/_Emw/an-ambitious-wikidata-tutorial
  • wptools: https://github.com/siznax/wptools/wiki
  • google translate: https://arxiv.org/pdf/1611.04558v1.pdf
  • https://arxiv.org/pdf/1611.05104v1.pdf
  • https://arxiv.org/pdf/1611.01587v2.pdf

16.11

  • dssm deep sem sim models: https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/wsdm2015.v3.pdf
  • twitter @ Singapore: http://www.straitstimes.com/singapore/twitter-eyes-local-talent-for-singapore-data-science-team
  • multiple tasks of NLP: https://arxiv.org/pdf/1611.01587v2.pdf
  • QUASI RNN: https://arxiv.org/pdf/1611.01576v1.pdf

15.11

  • regex learning: http://dlacombejr.github.io/2016/11/13/deep-learning-for-regex.html
  • recurrent + cnn for text classification: https://github.com/airalcorn2/Recurrent-Convolutional-Neural-Network-Text-Classifier
  • quiver: to view convnet layer https://github.com/jakebian/quiver
  • hera: to see training progress board: https://github.com/jakebian/hera
  • RAISR: Rapid and Accurate Image Super Resolution https://arxiv.org/pdf/1606.01299v3.pdf
  • why is machine learning hard: http://ai.stanford.edu/~zayd/why-is-machine-learning-hard.html

14.11

  • event ODSC West: https://www.odsc.com/california
  • MLconf SF 12 Nov, summary: https://github.com/adarsh0806/ODSCWest/blob/master/MLConf.md
  • Duy Do talk: https://speakerdeck.com/duydo/elasticsearch-for-data-engineers

13.11

  • barcampsaigon 2016: some good topics on Elasticsearch (Duy Do), Big Data analytics (Trieu Nguyen)
  • Altair https://speakerdeck.com/jakevdp/visualization-in-python-with-altair

12.11

Applications to explore (most of them are Keras-based):

  • https://github.com/farizrahman4u/seq2seq
  • https://github.com/farizrahman4u/qlearning4k
  • https://github.com/matthiasplappert/keras-rl
  • http://ml4a.github.io/guides/
  • https://github.com/kylemcdonald/SmileCNN
  • https://github.com/jocicmarko/ultrasound-nerve-segmentation
  • https://github.com/abbypa/NNProject_DeepMask
  • https://github.com/awentzonline/keras-rtst
  • https://github.com/phreeza/keras-GAN
  • https://github.com/jacobgil/keras-dcgan
  • https://github.com/mokemokechicken/keras_npi
  • https://github.com/codekansas/keras-language-modeling

11.11

  • https://github.com/wiki-ai/revscoring
  • Visual OCR attention: https://github.com/da03/Attention-OCR
  • startup and DL: https://github.com/lipiji/App-DL
  • embed + encode + attend + predict: https://explosion.ai/blog/deep-learning-formula-nlp
  • HN: https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf

10.11

  • https://arxiv.org/pdf/1508.06615.pdf

9.11

  • ibm researcher, LDA Gibbs sampling, doc2vec: https://github.com/jhlau

8.11

  • quoc le, rnn with reinforcement learning: http://openreview.net/pdf?id=r1Ue8Hcxg

7.11

  • https://github.com/vinhkhuc/MemN2N-babi-python
  • similarity proximity: http://www.datasciencecentral.com/profiles/blogs/comparison-between-global-vs-local-normalization-of-tweets-and
  • pycon15, elastic search: https://github.com/erikrose/elasticsearch-tutorial

6.11

  • https://github.com/Keats/rodent

04.11

  • airbnb knowledge scale: https://medium.com/airbnb-engineering/scaling-knowledge-at-airbnb-875d73eff091#.5moos4eki
  • R notebooks: http://rmarkdown.rstudio.com/r_notebooks.html
  • dask: https://github.com/dask/dask
  • dask vs celery: http://matthewrocklin.com/blog/work/2016/09/13/dask-and-celery
  • dask in jupyterlab: https://learning.acm.org/webinarpdfs/ChristineDoigWebinarSlides.pdf

3.11

  • https://hbr.org/resources/pdfs/hbr-articles/2016/11/thestateofmachineintelligence.pdf
  • shallow learn: gensim + fasttext: https://github.com/giacbrd/ShallowLearn
  • nn for sa: http://www.emnlp2016.net/tutorials/zhang-vo-t4.pdf

2.11

  • mask bilstm: http://dirko.github.io/Bidirectional-LSTM
