<Table of Contents>
(1) How to download Kaggle datasets via the API
(1-0) STEP0: (Preparation) Create a Kaggle account
(1-1) STEP1: (Preparation) Install the package
(1-2) STEP2: (Preparation) Get an API token
(1-3) STEP3: Sample program (dataset download)
(1) How to download Kaggle datasets via the API
This article walks through the steps for downloading Kaggle datasets from a Python program via the API.
(1-0) STEP0: (Preparation) Create a Kaggle account
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_1_kaggle_download_dataset.jpg)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_2_kaggle_download_dataset.jpg)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_3_kaggle_download_dataset.jpg)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_4_kaggle_download_dataset.jpg)
↓
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_5_kaggle_download_dataset.jpg)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_6_kaggle_download_dataset.jpg)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_111_7_kaggle_download_dataset.jpg)
(1-1) STEP1: (Preparation) Install the package
> pip install kaggle --user
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_121_1_kaggle_download_dataset.jpg)
Running setup.py install for kaggle ... done
Successfully installed kaggle-1.5.12
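To confirm from Python that the install succeeded, you can query the package metadata. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is made up for illustration and is not part of the kaggle package.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("kaggle")
    if version is None:
        print("kaggle is not installed; run: pip install kaggle --user")
    else:
        print(f"kaggle {version} is installed")
```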
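(1-2) STEP2: (Preparation) Get an API token
If kaggle.json has not been set up yet, a program that uses the API fails with an error like the following:
OSError: Could not find kaggle.json. Make sure it's located in C:\Users\Rainbow\.kaggle. Or use the environment method.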
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_122_kaggle_download_dataset.jpg)
●STEP2-1: Issue the API token
(Figure 123)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_123_1_kaggle_download_dataset.jpg)
↓
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_123_2_kaggle_download_dataset.jpg)
↓
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_123_3_kaggle_download_dataset.jpg)
●STEP2-2: Place kaggle.json
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_124_kaggle_download_dataset.jpg)
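Before calling the API, it can help to check that the credentials are where the client will look for them. The sketch below is an assumption-laden illustration, not the kaggle package's own logic: it assumes the token lives at `<home>/.kaggle/kaggle.json` (as the error message above indicates), that the file is JSON with `username` and `key` fields, and that the "environment method" means the `KAGGLE_USERNAME` / `KAGGLE_KEY` variables, with `KAGGLE_CONFIG_DIR` overriding the folder. The function name `find_kaggle_credentials` is hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional


def find_kaggle_credentials(
    env: Mapping[str, str] = os.environ,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Return a short description of where credentials were found, or None."""
    # Assumed "environment method": both variables must be set and non-empty.
    if env.get("KAGGLE_USERNAME") and env.get("KAGGLE_KEY"):
        return "environment variables"

    # Assumed default location: <home>/.kaggle, unless KAGGLE_CONFIG_DIR is set.
    config_dir = Path(env.get("KAGGLE_CONFIG_DIR") or (home or Path.home()) / ".kaggle")
    token_file = config_dir / "kaggle.json"
    if not token_file.is_file():
        return None

    # The token file is expected to be JSON with "username" and "key" fields.
    try:
        token = json.loads(token_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return None
    if isinstance(token, dict) and token.get("username") and token.get("key"):
        return str(token_file)
    return None


if __name__ == "__main__":
    location = find_kaggle_credentials()
    print(location or "No Kaggle credentials found")
```

A `None` result here corresponds to the situation in which the `OSError` shown earlier would be expected.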
(1-3) STEP3: Sample program (dataset download)
    from kaggle.api.kaggle_api_extended import KaggleApi
    import zipfile

    # Authenticate with the credentials stored in kaggle.json
    api = KaggleApi()
    api.authenticate()

    # Destination folder ('./' means the current directory)
    output_path = './kaggle_download_dataset/'

    # Download from kaggle.com/c/sentiment-analysis-on-movie-reviews
    # Two files are downloaded: train.tsv.zip / test.tsv.zip
    api.competition_download_file('sentiment-analysis-on-movie-reviews', 'train.tsv.zip', path=output_path)
    api.competition_download_file('sentiment-analysis-on-movie-reviews', 'test.tsv.zip', path=output_path)

    # Extract the zip files
    with zipfile.ZipFile(output_path + 'train.tsv.zip', 'r') as zipref:
        zipref.extractall(output_path)
    with zipfile.ZipFile(output_path + 'test.tsv.zip', 'r') as zipref:
        zipref.extractall(output_path)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_131_kaggle_download_dataset.jpg)
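The sample program extracts each archive by name. If the number of downloaded files grows, one alternative is to extract every `.zip` found in the output folder. The following is a sketch using only the standard library; the helper name `extract_all_zips` is made up for illustration and is not part of the Kaggle API.

```python
import zipfile
from pathlib import Path
from typing import List


def extract_all_zips(folder: str) -> List[str]:
    """Extract every .zip archive in `folder` into that same folder.

    Returns the names of the archives that were extracted, in sorted order.
    """
    extracted = []
    for archive in sorted(Path(folder).glob("*.zip")):
        with zipfile.ZipFile(archive, "r") as zipref:
            zipref.extractall(folder)
        extracted.append(archive.name)
    return extracted


if __name__ == "__main__":
    # Same folder name as in the sample program
    print(extract_all_zips("./kaggle_download_dataset/"))
```

If the folder does not exist or contains no archives, the function simply returns an empty list.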
(Example result)
Downloading train.tsv.zip to ./kaggle_download_dataset
100%|████████████████████████████████████████| 1.28M/1.28M [00:01<00:00, 1.25MB/s]
Downloading test.tsv.zip to ./kaggle_download_dataset
100%|████████████████████████████████████████| 494k/494k [00:00<00:00, 5.65MB/s]
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_132_kaggle_download_dataset.jpg)
↓
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_133_kaggle_download_dataset.jpg)
    import csv

    # Read the TSV file
    with open(output_path + 'train.tsv', encoding='utf-8', newline='') as f:
        for cols in csv.reader(f, delimiter='\t'):
            print(cols)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_134_kaggle_download_dataset.jpg)
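Printing every row is fine for a first look, but a quick summary is often more useful. The sketch below counts rows per value of one column using `csv.DictReader`, which treats the first line as the header. The column names (`PhraseId`, `SentenceId`, `Phrase`, `Sentiment`) and the sample rows are assumptions made for illustration; check the header of your own train.tsv before relying on them.

```python
import csv
import io
from collections import Counter
from typing import Dict, TextIO


def count_by_column(tsv_file: TextIO, column: str) -> Dict[str, int]:
    """Count how many rows have each distinct value in `column`."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return dict(Counter(row[column] for row in reader))


# Made-up sample standing in for train.tsv (column names are assumed).
SAMPLE = (
    "PhraseId\tSentenceId\tPhrase\tSentiment\n"
    "1\t1\tA good movie\t3\n"
    "2\t1\tgood\t3\n"
    "3\t2\tdull\t1\n"
)

if __name__ == "__main__":
    print(count_by_column(io.StringIO(SAMPLE), "Sentiment"))
```

To run it against the real file, pass the handle returned by `open(output_path + 'train.tsv', encoding='utf-8', newline='')` instead of the in-memory sample.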
(1-4) Troubleshooting: what to do when an HTTP 403 error occurs
●Error
HTTP response body: b'{"code":403,"message":"Permission \\u0027competitions.downloadData\\u0027 was denied"}'
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_211_kaggle_download_dataset.jpg)
●Cause
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_212_kaggle_download_dataset.jpg)
●Fix
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_213_kaggle_download_dataset.jpg)
![](https://rainbow-engine.com/wp-content/uploads/2023/04/RP-IT0715_Kaggle_download_dataset/RP-IT0715_214_kaggle_download_dataset.jpg)
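In a script, it can be friendlier to turn the 403 into a readable hint than to let the raw traceback through. The helper below is only a sketch: it assumes the raised exception exposes the HTTP status either as a `status` attribute or as `response.status_code`, which may differ between versions of the kaggle package. The hint text reflects a commonly reported cause of this error for competition downloads (the competition's rules not yet accepted on the website), so treat it as a suggestion rather than a diagnosis. The function name `describe_download_error` is hypothetical.

```python
def describe_download_error(exc: Exception) -> str:
    """Turn a download exception into a short, human-readable message."""
    # Assumption: the status code is available as .status or .response.status_code.
    status = getattr(exc, "status", None)
    if status is None:
        response = getattr(exc, "response", None)
        status = getattr(response, "status_code", None)

    if status == 403:
        return (
            "HTTP 403: permission denied. Check on the competition page "
            "that you have joined it and accepted its rules."
        )
    return f"Download failed: {exc}"
```

A typical use would be to wrap the `api.competition_download_file(...)` call in `try` / `except Exception as exc:` and print `describe_download_error(exc)`.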