Agent skill

funsloth-hfjobs

Training manager for Hugging Face Jobs - launch fine-tuning on HF cloud GPUs with optional WandB monitoring

Install this agent skill to your Project

npx add-skill https://github.com/chrisvoncsefalvay/funsloth/tree/main/skills/funsloth-hfjobs

SKILL.md

Hugging Face Jobs Training Manager

Run Unsloth training on Hugging Face Jobs (cloud GPU training).

Prerequisites

HF Authentication: Run huggingface-cli whoami to confirm you are logged in; if it fails, run huggingface-cli login first
HF Jobs Access: Requires a PRO subscription or organization compute access
Training notebook/script: Generated by funsloth-train
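The authentication prerequisite can also be checked from Python before anything is submitted. This is a minimal sketch, assuming huggingface_hub is installed in the calling environment; describe_login is a hypothetical helper (not part of this skill), and the identity lookup is passed in as a callable so the function can be exercised without network access.

```python
from typing import Callable


def describe_login(whoami: Callable[[], dict]) -> str:
    """Return a one-line summary of the current Hugging Face login state.

    `whoami` is any zero-argument callable that returns account info as a
    dict, or raises when no valid token is available. In real use this
    would be `huggingface_hub.whoami`.
    """
    try:
        info = whoami()
    except Exception as exc:  # missing token, expired token, network failure
        return f"not logged in: {exc}"
    return f"logged in as {info.get('name', '<unknown>')}"


# Typical use (requires network access and a stored token):
#   from huggingface_hub import whoami
#   print(describe_login(whoami))
```

Passing the lookup function in, rather than importing it at module level, keeps the helper usable in environments where the Hub client is not installed.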

Workflow

1. Select Hardware

| GPU | VRAM | Cost (approx.) | Best For |
|---|---|---|---|
| A10G | 24 GB | ~$1.50/hr | 7-14B LoRA |
| A100 40GB | 40 GB | ~$4/hr | 14-34B |
| A100 80GB | 80 GB | ~$6/hr | 70B |
| H100 | 80 GB | ~$8/hr | Fastest |

See references/HARDWARE_GUIDE.md for model-to-GPU mapping.
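The table can be turned into a simple lookup. The sketch below is illustrative only: pick_gpu is a hypothetical helper, and the size limits are one reading of the "Best For" column (upper bound of each range, in billions of parameters), not values taken from references/HARDWARE_GUIDE.md. The H100 row is left out because the table recommends it for speed rather than for a model size.

```python
# Rows copied from the hardware table above. The last field is an assumed
# upper bound on model size (billions of parameters) read off "Best For".
GPU_TABLE = [
    # (name, vram_gb, approx_usd_per_hour, max_model_size_b)
    ("A10G", 24, 1.50, 14),
    ("A100 40GB", 40, 4.00, 34),
    ("A100 80GB", 80, 6.00, 70),
]


def pick_gpu(model_size_b: float) -> str:
    """Return the cheapest listed GPU whose assumed size limit fits the model."""
    if model_size_b <= 0:
        raise ValueError("model size must be positive")
    # GPU_TABLE is ordered from cheapest to most expensive.
    for name, _vram_gb, _usd_per_hour, max_size_b in GPU_TABLE:
        if model_size_b <= max_size_b:
            return name
    raise ValueError(
        f"no single GPU in the table is listed for a {model_size_b}B model"
    )


print(pick_gpu(8))   # A10G
print(pick_gpu(32))  # A100 40GB
```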

2. Convert Notebook to Script

Convert the notebook to a standalone Python script that declares its dependencies inline as PEP 723 metadata at the top of the file:

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git",
#     "torch>=2.0",
#     "transformers>=4.45",
#     "trl>=0.12",
#     "peft>=0.13",
#     "datasets>=2.18",
# ]
# ///
```

Use scripts/train_sft.py as a template.

3. Optional: WandB Integration

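Add the following to the training script:

```python
import wandb

wandb.init(project="funsloth-training")
# Also pass report_to="wandb" to TrainingArguments so the trainer logs metrics
```

Set the API key in the environment before launching: export WANDB_API_KEY="your-key"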

4. Estimate Costs

Use the cost estimator:

```bash
python scripts/estimate_cost.py --tokens {total_tokens} --platform hfjobs
```
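The arithmetic behind such an estimate is simple enough to check by hand. The sketch below is not the logic of scripts/estimate_cost.py, which is not shown here; it assumes cost scales linearly with tokens processed, uses the approximate hourly rates from the hardware table, and takes training throughput (tokens_per_second) as a caller-supplied assumption that has to be measured for the actual model and GPU.

```python
# Approximate hourly rates from the hardware table in step 1.
HOURLY_RATE_USD = {
    "A10G": 1.50,
    "A100 40GB": 4.00,
    "A100 80GB": 6.00,
    "H100": 8.00,
}


def estimate_cost(
    total_tokens: int,
    gpu: str,
    tokens_per_second: float,
    epochs: int = 1,
) -> tuple[float, float]:
    """Return (hours, usd) for one GPU at a given, assumed throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    hours = total_tokens * epochs / tokens_per_second / 3600
    return hours, hours * HOURLY_RATE_USD[gpu]


# 36M tokens at an assumed 2,000 tokens/s on an A10G:
# 36,000,000 / 2,000 = 18,000 s = 5 h, and 5 h x $1.50/hr = $7.50.
hours, usd = estimate_cost(36_000_000, "A10G", tokens_per_second=2_000)
print(f"{hours:.1f} h, ${usd:.2f}")  # 5.0 h, $7.50
```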

5. Launch Job

```bash
# Create job config
cat > job_config.yaml << 'EOF'
compute:
  gpu: {gpu_type}
  gpu_count: 1
script: train_hfjobs.py
outputs:
  - /outputs/*
EOF

# Submit
huggingface-cli jobs create --config job_config.yaml
```
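The {gpu_type} placeholder has to be filled in before the config is written. The sketch below renders the same layout from Python; it mirrors the config shown above and does not verify that layout against the Hugging Face Jobs service. render_job_config is a hypothetical helper, and the identifier "a10g-large" in the example is an assumed value used only for illustration.

```python
JOB_CONFIG_TEMPLATE = """\
compute:
  gpu: {gpu_type}
  gpu_count: {gpu_count}
script: {script}
outputs:
  - /outputs/*
"""


def render_job_config(
    gpu_type: str,
    script: str = "train_hfjobs.py",
    gpu_count: int = 1,
) -> str:
    """Fill the job config template, rejecting unfilled placeholders."""
    if not gpu_type or "{" in gpu_type or "}" in gpu_type:
        raise ValueError("gpu_type must be a concrete identifier")
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    return JOB_CONFIG_TEMPLATE.format(
        gpu_type=gpu_type, gpu_count=gpu_count, script=script
    )


config_text = render_job_config("a10g-large")  # identifier is an assumption
print(config_text)

# To write the file used by the submit command:
#   from pathlib import Path
#   Path("job_config.yaml").write_text(config_text)
```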

6. Monitor Progress

```bash
huggingface-cli jobs status {job_id}
huggingface-cli jobs logs {job_id} --follow
```

WandB: https://wandb.ai/{username}/funsloth-training

7. Download Artifacts

```python
from huggingface_hub import snapshot_download

snapshot_download(repo_id="{username}/funsloth-job", local_dir="./outputs")
```

8. Handoff

Once training has finished, offer the funsloth-upload skill to publish the model to the Hub with a model card.

Error Handling

| Error | Resolution |
|---|---|
| No HF Jobs access | Get PRO subscription |
| OOM | Reduce batch size or upgrade GPU |
| Job timeout | Enable checkpointing |
| Script error | Check PEP 723 dependencies |

Bundled Resources

scripts/train_sft.py - PEP 723 script template
scripts/estimate_cost.py - Cost estimation
references/PLATFORM_COMPARISON.md - HF Jobs vs alternatives
references/HARDWARE_GUIDE.md - VRAM requirements
references/TROUBLESHOOTING.md - Common issues

Maintainer

chrisvoncsefalvay Core maintainer

Source details

Full Name: chrisvoncsefalvay/funsloth
Branch: main
Path in repo: skills/funsloth-hfjobs
