
Datasets in Python's seaborn and scikits.learn

This page saves the datasets bundled with Python's seaborn and scikits.learn to CSV files.


Preparation

On Ubuntu

On Windows


seaborn datasets in Python

Python program:

import pandas as pd
import seaborn 

iris = seaborn.load_dataset('iris')
tips = seaborn.load_dataset('tips')
planets = seaborn.load_dataset('planets')
gammas = seaborn.load_dataset('gammas')
titanic = seaborn.load_dataset('titanic')
anscombe = seaborn.load_dataset('anscombe')
exercise = seaborn.load_dataset('exercise')

iris.to_csv('iris.csv', encoding='utf-8', index_label='id')
tips.to_csv('tips.csv', encoding='utf-8', index_label='id')
planets.to_csv('planets.csv', encoding='utf-8', index_label='id')
gammas.to_csv('gammas.csv', encoding='utf-8', index_label='id')
titanic.to_csv('titanic.csv', encoding='utf-8', index_label='id')
anscombe.to_csv('anscombe.csv', encoding='utf-8', index_label='id')
exercise.to_csv('exercise.csv', encoding='utf-8', index_label='id')

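# Show the first rows of each dataset (print is needed when run as a script)
print(iris.head())
print(tips.head())
print(planets.head())
print(gammas.head())
print(titanic.head())
print(anscombe.head())
print(exercise.head())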
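
The export step above repeats the same to_csv call for every dataset. A minimal sketch of the same idea as a loop follows. The helper name save_frames and the small demo frame are hypothetical and only there so the sketch runs without downloading anything; the to_csv options are the ones used above.

```python
import os
import tempfile

import pandas as pd


def save_frames(frames, out_dir="."):
    """Write each DataFrame in `frames` to <out_dir>/<name>.csv and return the paths."""
    paths = []
    for name, df in frames.items():
        path = os.path.join(out_dir, name + ".csv")
        # Same options as the to_csv calls above: UTF-8, index written as "id".
        df.to_csv(path, encoding="utf-8", index_label="id")
        paths.append(path)
    return paths


# Small stand-in frame (hypothetical data) so the sketch runs offline.
demo = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})
out_dir = tempfile.mkdtemp()
written = save_frames({"demo": demo}, out_dir)

# Read the file back, using the "id" column as the index again.
restored = pd.read_csv(written[0], index_col="id")
```

With seaborn, the dictionary could be built as {name: seaborn.load_dataset(name) for name in ['iris', 'tips']}; note that seaborn.load_dataset fetches the data from an online repository, so it needs a network connection the first time.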


scikits.learn datasets in Python

Save the datasets in scikits.learn.datasets to CSV files. (scikits.learn is the former name of scikit-learn; current releases are imported as sklearn.)

Reference web page: http://scikit-learn.org/stable/datasets/

Python program:

import numpy as np
import pandas as pd
# scikits.learn is the old package name; current releases are imported as sklearn.
import sklearn.datasets

# diabetes: keep each feature vector as a list in a single "data" column
a = sklearn.datasets.load_diabetes()
diabetes = pd.DataFrame({"data": a.data.tolist()})
diabetes["target"] = pd.Series(a.target)

# digits: each row of a.data is a flattened 8x8 image (64 values)
a = sklearn.datasets.load_digits()
digits = pd.DataFrame({"data": a.data.tolist()})
digits["target"] = pd.Series(a.target)

# iris: one column per measurement, plus the class number and class name
a = sklearn.datasets.load_iris()
iris = pd.DataFrame( a.data, columns=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'] )
iris["species_number"] = pd.Series(a.target)
iris["species"] = pd.Series(a.target_names[a.target])

# linnerud: a.target holds the physiological variables (weight, waist, pulse),
# a.data holds the exercise variables (chins, situps, jumps)
a = sklearn.datasets.load_linnerud()
linnerud = pd.DataFrame( np.hstack((a.target, a.data)), columns=['weight', 'waist', 'pulse', 'chins', 'situps', 'jumps'] )

diabetes.to_csv('diabetes.csv', encoding='utf-8', index_label='id')
digits.to_csv('digits.csv', encoding='utf-8', index_label='id')
iris.to_csv('iris.csv', encoding='utf-8', index_label='id')
linnerud.to_csv('linnerud.csv', encoding='utf-8', index_label='id')
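
Because the "data" column of diabetes and digits holds a list in each row, the CSV file stores that list as text. The following is a minimal sketch of reading such a file back and recovering the lists, assuming the cells were written as plain Python numbers. The small frame and the in-memory buffer are hypothetical stand-ins for a file such as digits.csv.

```python
import ast
import io

import pandas as pd

# Stand-in for a file such as digits.csv: the "data" column holds a list per row.
frame = pd.DataFrame({"data": [[0.0, 1.5, 2.0], [3.0, 4.5, 5.0]], "target": [0, 1]})
buffer = io.StringIO()
frame.to_csv(buffer, index_label="id")
buffer.seek(0)

# After a plain read, each cell of "data" is a string like "[0.0, 1.5, 2.0]".
raw = pd.read_csv(buffer, index_col="id")
cell_type_before = type(raw.loc[0, "data"])

# ast.literal_eval parses that text back into a Python list.
raw["data"] = raw["data"].apply(ast.literal_eval)
cell_type_after = type(raw.loc[0, "data"])
```

An alternative design is to write one CSV column per feature, as the iris frame above does, which avoids the parsing step at the cost of many columns for digits.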

Display the image data in digits (visualize the first image of the digits dataset):

import numpy as np
import pylab as pl

# First image with the default colormap
pl.matshow( np.array( digits["data"][0] ).reshape(8, 8) )

# Same image in grayscale
pl.gray()
pl.matshow( np.array( digits["data"][0] ).reshape(8, 8) )
pl.show()
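
The reshape(8, 8) call is what turns a flat list of 64 values back into an image. A small sketch of how the values are laid out follows, using the numbers 0 to 63 as a hypothetical stand-in for one row of digits["data"].

```python
import numpy as np

# Hypothetical stand-in for one row of digits["data"]: 64 values, 0..63.
flat = list(range(64))
image = np.array(flat).reshape(8, 8)

# Row-major order: the first 8 values form the top row, the next 8 the second row, and so on.
top_row = image[0].tolist()

# The value at row 2, column 3 is flat[2 * 8 + 3].
pixel = int(image[2, 3])
```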