A Self-Taught Student's Study Notes

A student's study blog where I post the notes I put together while teaching myself.


What Is Machine Learning? - 1 (+English)

공부 안 하는 학생 ("The Student Who Doesn't Study") 2020. 10. 14. 03:49

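While studying machine learning recently, I signed up for a program sponsored by my school.

These are brief notes taken from the slides (PPT). Translating them into Korean turned out to be harder, so I wrote them in English.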

 

Before diving in, let's take a brief look at where machine learning is used.

  • Self-driving cars (computer vision, reinforcement learning)
  • Object detection (computer vision)
  • Face and video generation
  • AI assistants (natural language processing)
  • Game playing (AlphaGo)
  • Time series analysis (a time series is a sequence of data points arranged at regular time intervals)

What is machine learning?

A program or system that builds (trains/learns) a predictive model from input data. The system uses the learned model to make useful predictions on new data.

It also refers to the field that develops algorithms and techniques that enable computers to learn.

**Algorithm automatically "learns" from data**

"Train" a model on lots and lots of data.

  • Start with poor predictions
  • Make little tweaks to improve
  • Like a child doing homework

Infer predictions on new data: after the process above, the model can output predicted values for new data.
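The "start poor, make little tweaks, then infer" loop above can be sketched in a few lines. This is only a minimal illustration, assuming a one-parameter model y = w*x fitted by gradient descent; the toy data, the learning rate, and the function names are my own choices and are not from the slides.

```python
# Toy data generated from the rule y = 3x (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def mean_squared_error(w, xs, ys):
    """Average squared gap between predictions w*x and the true values."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train(xs, ys, learning_rate=0.01, steps=200):
    """Fit y = w*x by gradient descent, starting from a poor guess."""
    w = 0.0  # poor initial model: every prediction is 0
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad  # one little tweak that reduces the error
    return w

def predict(w, x):
    """Inference: apply the learned weight to new data without changing it."""
    return w * x

initial_error = mean_squared_error(0.0, xs, ys)
w = train(xs, ys)
final_error = mean_squared_error(w, xs, ys)

print("error before training:", initial_error)
print("error after training:", final_error)
print("prediction for x = 10:", predict(w, 10.0))
```

The error shrinks a little on every step, and the final call to `predict` uses an input the model never saw during training.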

Supervised

  • All labeled data
  • e.g. classification

Semi-supervised

  • Some labeled data, lots of unlabeled data

Unsupervised

  • Discover patterns without labels

Reinforcement

  • Learn as you go; needs an environment (continuous input of data)
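To see the difference between two of these categories, here is a rough sketch that contrasts supervised and unsupervised learning on the same numbers. It assumes a nearest-neighbor classifier for the supervised case and a two-cluster k-means-style loop for the unsupervised case; both techniques and the made-up weights are my own picks for illustration, and semi-supervised and reinforcement learning are not covered.

```python
# Hypothetical 1-D measurements (e.g. animal weights in kg); values are made up.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

def nearest_neighbor_predict(x, points, labels):
    """Supervised: reuse the label of the closest labeled training point."""
    best = min(range(len(points)), key=lambda i: abs(points[i] - x))
    return labels[best]

def two_means(points, iterations=10):
    """Unsupervised: split the points into two groups without using labels."""
    c0, c1 = min(points), max(points)  # initial cluster centers
    for _ in range(iterations):
        g0 = [p for p in points if abs(p - c0) <= abs(p - c1)]
        g1 = [p for p in points if abs(p - c0) > abs(p - c1)]
        # Move each center to the mean of its group (assumes both are non-empty).
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return g0, g1

print(nearest_neighbor_predict(0.9, points, labels))
print(nearest_neighbor_predict(4.5, points, labels))
small_group, large_group = two_means(points)
print(small_group, large_group)
```

The classifier needs `labels` to answer with "cat" or "dog", while `two_means` only finds that the numbers fall into two groups and has no idea what the groups are called.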

Define problem -> Analyze data -> Model selection -> Validate results -> Document & improve

Define problem

  • The basis of our problem will be something to do with dogs and cats.
  • We want to distinguish between cats and dogs through image classification.
  • Then move on to gathering data.

Analyze Data

  • Gather data: you need to put together large amounts of data, in the form of images, tabular data, or time series data.
  • Inspect data: take a look at your data. Every good machine learning engineer, scientist, or researcher knows the data they are working with inside and out.
  • What types of patterns can the model learn?
  • Preprocess data: data from the real world comes in messy formats, and machine learning models need clean data.
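As a small illustration of the preprocessing step, the sketch below cleans a few messy tabular records. The field names (`weight_kg`, `label`), the sample values, and the specific cleaning rules (drop rows with missing values, normalize label text, min-max scale the numbers) are all assumptions made for the example, not something the slides prescribe.

```python
# Hypothetical raw records: strings with stray whitespace and a missing entry.
raw_rows = [
    {"weight_kg": " 4.0", "label": "Cat "},
    {"weight_kg": "30.0", "label": "dog"},
    {"weight_kg": "", "label": "DOG"},  # missing measurement
    {"weight_kg": "17.0 ", "label": " cat"},
]

def preprocess(rows):
    """Drop rows with missing values, tidy labels, and scale weights to [0, 1]."""
    cleaned = []
    for row in rows:
        text = row["weight_kg"].strip()
        if not text:
            continue  # skip rows without a measurement
        cleaned.append({
            "weight": float(text),
            "label": row["label"].strip().lower(),
        })
    lo = min(r["weight"] for r in cleaned)
    hi = max(r["weight"] for r in cleaned)
    for r in cleaned:
        # Min-max scaling; assumes at least two distinct weights (hi > lo).
        r["weight"] = (r["weight"] - lo) / (hi - lo)
    return cleaned

clean_rows = preprocess(raw_rows)
for row in clean_rows:
    print(row)
```

Printing the rows before and after is also a cheap way to do the "inspect data" step.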

Model selection

  • Labeled images of cats and dogs = supervised learning
  • Deep learning will be very effective for this problem
  • Convolutional neural networks are popular and state of the art for image classification
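The slides do not go into how a convolutional neural network works, so the following is only a sketch of the operation that gives it its name. It assumes a plain "valid" 2-D convolution (technically cross-correlation, as deep learning libraries usually implement it) over a tiny made-up image. In a real CNN the kernel values are learned during training; here the kernel is hand-picked to show what one filter can detect.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# Tiny made-up image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-picked kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, edge_kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The output is non-zero only in the middle column, where the image changes from dark to bright, which is the kind of local pattern a convolutional layer picks up.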

Validate Results

  • Check model performance on new images
  • Training accuracy alone is misleading!
  • We can't reuse the images we trained on
  • Since our model has already been trained, we don't update it; we just see what it predicts
  • Inference: run the model without updating it
  • Run inference on the test/validation data
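Why training accuracy alone is misleading can be shown with a tiny example. This sketch assumes a nearest-neighbor model that simply memorizes its training data; the weights and labels are made up, and one training example is deliberately unusual (a very heavy cat) to make the effect visible.

```python
# Made-up (weight_kg, label) pairs; the 9 kg cat is a deliberately odd example.
train_set = [(3.0, "cat"), (4.0, "cat"), (5.0, "cat"), (9.0, "cat"),
             (20.0, "dog"), (25.0, "dog"), (30.0, "dog")]
# Held-out examples that were never used for training.
test_set = [(3.5, "cat"), (4.5, "cat"), (10.0, "dog"), (22.0, "dog")]

def predict(x, train_set):
    """Inference only: copy the label of the nearest training example."""
    nearest = min(train_set, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(dataset, train_set):
    """Fraction of examples in dataset that the model labels correctly."""
    correct = sum(1 for x, label in dataset if predict(x, train_set) == label)
    return correct / len(dataset)

train_accuracy = accuracy(train_set, train_set)
test_accuracy = accuracy(test_set, train_set)
print("train accuracy:", train_accuracy)  # 1.0
print("test accuracy:", test_accuracy)    # 0.75
```

The model scores perfectly on the data it memorized, but the 10 kg dog in the test set is misclassified, so only the held-out score says anything about new images.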

Document & Improve

  • It is essential to track which models you've tried, the settings you used, and how they performed, so you know how to improve.
  • Models often have design choices called hyperparameters. These are values that influence the performance of your model.
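One simple way to keep track of what was tried is to log every run together with its settings. The sketch below assumes a toy model y = w*x trained by gradient descent, with the learning rate and the number of steps as the hyperparameters; the data, the candidate values, and the log format are all illustrative choices of mine.

```python
# Toy data following y = 3x (made up for the example).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]

def train_and_score(learning_rate, steps):
    """Fit y = w*x by gradient descent and return the final mean squared error."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Record every combination tried, with its settings and its result.
experiment_log = []
for learning_rate in [0.0001, 0.001, 0.01]:
    for steps in [10, 100]:
        error = train_and_score(learning_rate, steps)
        experiment_log.append(
            {"learning_rate": learning_rate, "steps": steps, "mse": error}
        )

for run in experiment_log:
    print(run)

best_run = min(experiment_log, key=lambda run: run["mse"])
print("best settings:", best_run)
```

With the log in hand it is easy to see that the same model does very differently depending on its settings, and which direction to try next.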

 

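There are still quite a few slides left, but I'll stop here for today. It's exam season, so I need to prepare for my exams.

This was 공부 안 하는 학생 ("The Student Who Doesn't Study"), studying machine learning.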