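I've been studying machine learning lately, and along the way I signed up for a program that my school supports.
These are simple notes from the PPT slides; translating them into Korean turned out to be harder, so I wrote them in English.
Before getting started, let's take a quick look at where machine learning is used.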
- Self-driving cars (computer vision, reinforcement learning)
- Object detection (computer vision)
- Face and video generation
- AI assistants (natural language processing)
- Game playing (AlphaGo)
- Time series analysis (a time series is a sequence of data points placed at regular time intervals)
What is machine learning?
A program or system that builds (trains/learns) a predictive model from input data. The system uses the learned model to make useful predictions on new data.
It also refers to the field that develops the algorithms and techniques that allow computers to learn.
**Algorithm automatically "learns" from data**
"Train" a model on lots and lots of data
- Start with poor predictions
- Make little tweaks to improve
- Like a child doing homework
Infer predictions on new data: after the training process above, the model can output predicted values for new data.
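The "start poor, make little tweaks" idea can be sketched with gradient descent on a tiny linear model. This is only an illustration and not something from the slides: the data points, the function names, and the settings (`steps`, `learning_rate`) are all made up for the example.

```python
# Toy data that follows y = 2x + 1 (made-up numbers for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def mean_squared_error(w, b):
    """Average squared gap between the predictions w*x + b and the targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(steps=2000, learning_rate=0.05):
    w, b = 0.0, 0.0  # start with poor predictions
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
        # Make a little tweak in the direction that reduces the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train()
# Inference: use the learned w and b on an input that was not in the data.
prediction = w * 10.0 + b
print(w, b, prediction)
```

The loop never sees the rule y = 2x + 1 directly; it only sees the examples and keeps nudging `w` and `b` until the error gets small.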
Supervised
- All labeled data
- Classification
Semi-supervised
- Some labeled data, lots of unlabeled data
Unsupervised
- Discover patterns without labels
Reinforcement
- Learn as you go; needs an environment (continuous input of data)
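To make the difference between supervised and unsupervised learning more concrete, here is a small sketch. It assumes a single made-up feature (animal weight in kg), and the helper names `fit_centroids`, `classify`, and `two_means` are hypothetical, chosen only for this example.

```python
# Hypothetical 1-D feature (animal weight in kg); the values are made up.
weights = [3.0, 4.0, 5.0, 20.0, 25.0, 30.0]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

def fit_centroids(values, labels):
    """Supervised: use the labels to compute the average value of each class."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, value):
    """Predict the class whose average is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def two_means(values, iterations=10):
    """Unsupervised: find two group centers without looking at any labels."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return low, high

centroids = fit_centroids(weights, labels)
print(classify(centroids, 6.0))   # the supervised model can name the class
print(two_means(weights))         # the unsupervised one only finds groups
```

The supervised version can answer "cat or dog?" because it was given the labels. The unsupervised version discovers that there are two clusters, but it has no idea what to call them.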
Define problem -> Analyze data -> Model selection -> Validate results -> Document & improve
Define problem
- The base of our problem will be something to do with dogs and cats.
- We want to distinguish between cats and dogs through image classification.
- Then move on to gathering data.
Analyze Data
- Gather data: you need to put together large amounts of data, in the form of images, tabular data, or time series data.
- Inspect data: take a look at your data. Every good machine learning engineer/scientist/researcher knows the data they are working with inside and out.
- What types of patterns can the model learn?
- Preprocess data: data from the real world comes in messy formats, but machine learning models need clean data.
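As a rough idea of what "preprocess" can mean, the sketch below drops incomplete records and scales pixel values to the range 0 to 1. The record layout (a list of 8-bit pixel values plus a label) and the function name `preprocess` are assumptions made for this example, not something from the slides.

```python
def preprocess(records):
    """Drop incomplete records and scale 8-bit pixel values to [0, 1]."""
    cleaned = []
    for pixels, label in records:
        if label is None or any(p is None for p in pixels):
            continue  # skip messy records instead of guessing the missing values
        cleaned.append(([p / 255.0 for p in pixels], label))
    return cleaned

# Made-up "images" with only three pixels each, to keep the example short.
raw = [
    ([0, 128, 255], "cat"),
    ([12, None, 40], "dog"),   # missing pixel
    ([255, 255, 0], None),     # missing label
    ([51, 102, 204], "dog"),
]
clean = preprocess(raw)
print(len(raw), "records in,", len(clean), "records out")
```

Dropping bad records is the simplest possible policy. Depending on the data, filling in missing values or fixing labels by hand may be a better choice.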
Model selection
- Labeled images of cats and dogs = supervised learning
- Deep learning will be very effective for this problem
- Convolutional neural networks are popular and state-of-the-art for image classification
Validate Results
- Check model performance on new images
- Training accuracy alone is misleading!
- We can't use the images we trained on
- Since our model has already been trained, we don't update it; we just see what it predicts
- Inference: run the model without updating it
- Run inference on the test/validation data
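Here is a small sketch of why training accuracy alone is misleading. It uses made-up weight data instead of images, and holds out every fourth example so the result is reproducible (real projects usually shuffle before splitting). Both "models" and all names are hypothetical, written just for this illustration.

```python
def accuracy(predict, examples):
    """Fraction of examples where the prediction matches the label."""
    correct = sum(1 for x, y in examples if predict(x) == y)
    return correct / len(examples)

# Made-up data: the feature is a weight, and anything under 10 is a "cat".
data = [(float(w), "cat" if w < 10 else "dog") for w in range(1, 21)]

# Hold out every fourth example for testing; the model never trains on these.
test_set = data[::4]
train_set = [example for i, example in enumerate(data) if i % 4 != 0]

# Model A memorizes the training examples and guesses "cat" for anything else.
memory = dict(train_set)
def memorizer(x):
    return memory.get(x, "cat")

# Model B learns one threshold halfway between the two classes.
max_cat = max(x for x, y in train_set if y == "cat")
min_dog = min(x for x, y in train_set if y == "dog")
threshold = (max_cat + min_dog) / 2
def threshold_model(x):
    return "cat" if x < threshold else "dog"

# Inference only: neither model is updated while we evaluate it.
for name, model in (("memorizer", memorizer), ("threshold", threshold_model)):
    print(name, accuracy(model, train_set), accuracy(model, test_set))
```

Both models are perfect on the training set, so training accuracy cannot tell them apart. Only the held-out examples show that the memorizer generalizes worse.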
Document & Improve
- It is essential to track which models you've tried, the settings you used, and how they did, so you know how you can improve
- Models often have design choices called hyperparameters. These are values that influence the performance of your model.
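One simple way to track experiments is to record every run's hyperparameters together with its result. The sketch below tries a few learning rates on a toy linear model; the data, the function `train_and_score`, and the candidate values are all assumptions for illustration. For brevity the loss is measured on the same toy data, whereas a real comparison should use held-out validation data.

```python
# Toy data that follows y = 2x + 1 (made-up numbers for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

def train_and_score(learning_rate, steps):
    """Fit y = w*x + b by gradient descent and return the final squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Keep a record of every setting tried and how it did.
experiments = []
for learning_rate in (0.001, 0.01, 0.05):
    loss = train_and_score(learning_rate, steps=200)
    experiments.append({"learning_rate": learning_rate, "steps": 200, "loss": loss})

best = min(experiments, key=lambda run: run["loss"])
for run in experiments:
    print(run)
print("best:", best)
```

Here `learning_rate` and `steps` are the hyperparameters: they are not learned from the data, but they change how well the model ends up doing.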
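There is still quite a lot of the PPT left, but I'll stop here for today. It's exam season, so I need to get ready for my exams.
This has been a student who studies machine learning instead of studying for exams.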