awesome-ai-engineering

A list of resources to help you become a better AI engineer.

What's an AI Engineer?

  • Microsoft's definition "Artificial intelligence (AI) engineers are responsible for developing, programming and training the complex networks of algorithms that make up AI so that they can function like a human brain. This role requires combined expertise in software development, programming, data science and data engineering"
  • Coursera's definition "Artificial intelligence engineers are individuals who use AI and machine learning techniques to develop applications and systems that can help organizations increase efficiency, cut costs, increase profits, and make better business decisions."
  • Tech Target "AI engineers develop, program and train the complex networks of algorithms that encompass AI so those algorithms can work like a human brain. AI engineers must be experts in software development, data science, data engineering and programming."
  • Swyx podcast (17 April 2024)
  • Scaler Blogs "AI engineers design, develop, and deploy intelligent systems using machine learning, deep learning, and NLP to solve complex problems and enable autonomous decision-making."

What's the difference between an AI Engineer and a Machine Learning Engineer?

  • UpWork "AI engineers work on a broader set of tasks that encompass various forms of machine intelligence, like neural networks, to develop AI models for specific applications. In contrast, ML engineers focus more on ML algorithms and models that can self-tune to better learn and make predictions from large data sets."

What's the difference between an AI Engineer and a Software Engineer?

  • IEEE (ChatGPT's summary of that page): "AI engineers blend traditional software engineering skills with a deep understanding of machine learning and artificial intelligence to develop systems that enhance decision-making and automation within organizations. They are proficient in AI technologies and statistical analysis, focusing on building and integrating AI models into applications. On the other hand, software engineers focus broadly on designing, implementing, and maintaining software systems, with a comprehensive grasp of the software development lifecycle, from requirement analysis to deployment and maintenance. The distinction is further marked by the AI engineer's need to navigate emerging AI technologies, whereas software engineers adhere to established engineering principles and practices across various platforms and technologies."

Practical Tools & Techniques

This section covers practical tools and techniques you can use to become a better AI engineer.

LLM Platforms and APIs

LLM Platforms

  • ChatGPT
  • Claude.ai
  • Phind (dev focus, GPT-4 + own models)
  • Microsoft Copilot (GPT-4 + own models)
  • Perplexity.ai
  • You.com
  • groq.com

LLM APIs and Inference Services

GPU Marketplaces

Free Open-Weight Playgrounds

Try out open source models instantly.

Self-Hosted Open-Weight Inference

  • Ollama (Go, open source)
  • LocalAI (Go, open source)
  • msty.app
  • Nitro.jan.ai
  • Paddler: scaling / load balancing of llama.cpp inference

SaaS

  • fal.ai
  • lepton.ai
  • modal.com: on-demand serverless container + GPU execution runtime
  • Predibase: LLM fine-tuning and hosting
  • brev.dev
  • Replicate.com: models-as-a-service
  • Together.ai: serverless LLM / multimodal inference
  • Lambda Labs: manual rental of GPUs / clusters
  • Beam.cloud: fast standup of serverless generative AI
  • Runpod
  • Cloudflare Workers AI
  • CoreWeave: autoscaling GPUs + serverless (Knative)
  • MosaicML (acquired by Databricks)
  • mixedbread.ai: retrieval as a service (search, reranking, embedding)
  • lamini.ai: LLM inference
  • Anyscale + rai.ai scaling
  • Hugging Face Inference API
  • massedcompute.com
  • Salad.com
  • Openpipe.ai
  • Unsloth.ai
  • Crusoe.ai GPU rental
  • Akash
  • Groq: ultra-fast LLM inference for selected models
  • BoltAI
  • Saturn Cloud
  • Fireworks.ai
  • Inferless.com
  • Banana.dev (defunct)
  • pipeline.ai
  • hyperstack.cloud
  • Alibaba Elastic GPU service
  • Cloudalize GPU Kubernetes Service
  • Tensordock.com
  • Fly GPUs: GPUs on demand
  • Jarvis Labs: GPUs on demand
  • BentoML: open-source open-weight inference with a cloud option
  • bitbop: GPU dev in the cloud

Structured output

  • SGLang
  • outlines
  • Instructor
  • Marginalia
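
As a quick illustration of what the libraries above do, here is a minimal sketch using Instructor with a Pydantic schema to coerce an LLM response into typed fields. The model name and prompt are placeholders, and the exact client setup may differ between Instructor versions.

```python
# Sketch: structured output with Instructor + Pydantic (details vary by version).
import instructor
from openai import OpenAI
from pydantic import BaseModel


class CityInfo(BaseModel):
    name: str
    country: str
    population: int


# Wrap the OpenAI client so responses are parsed and validated into the schema.
client = instructor.from_openai(OpenAI())

city = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    response_model=CityInfo,
    messages=[{"role": "user", "content": "Tell me about Paris."}],
)
print(city.name, city.country, city.population)
```

The other libraries (SGLang, outlines) achieve the same goal by constraining generation itself (e.g. with grammars or regex) rather than validating after the fact.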

Prompt engineering

LLM Development and Optimization

LLM Testing and Evaluation

  • promptfoo
  • Ollama grid search
  • Uptrain
  • Google Cloud (GCP) AutoSxS
  • Paloma
  • LightEval
  • Bayesian Evaluation
  • Mozilla's experience
  • Ruler (long context evaluation)
  • OpenAI Simple Evals
  • Moonshot
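
The tools above automate variations of the same core loop: run prompts against one or more models, score the outputs, and aggregate. A hand-rolled sketch of that loop, with a placeholder `call_model` function standing in for whatever LLM API you use:

```python
# Sketch of the basic eval loop that tools like promptfoo automate.
def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM API call here")  # placeholder


test_cases = [
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "What is 2 + 2?", "expected": "4"},
]


def run_evals(cases):
    results = []
    for case in cases:
        output = call_model(case["prompt"])
        # Simple containment check; real eval tools also support exact match,
        # regex, semantic similarity, and model-graded assertions.
        passed = case["expected"].lower() in output.lower()
        results.append({"prompt": case["prompt"], "passed": passed})
    score = sum(r["passed"] for r in results) / len(results)
    return results, score
```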

Leaderboards / Evaluations

Observability

Pretraining

  • llm.c: Andrej Karpathy's GPT-2 from the ground up in raw C

Human Input Methods

  • RLHF
  • DPO
  • KTO
  • LIPO
  • DoRA
  • SPO
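
As a concrete example of one of these methods, the DPO objective compares the policy's log-probability ratio on chosen vs. rejected answers against a frozen reference model. A minimal PyTorch sketch of the loss, assuming per-sequence log-probabilities have already been computed:

```python
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss.

    Each argument is a tensor of per-sequence log-probabilities (sum of token
    log-probs) for the chosen / rejected completions under the policy being
    trained and the frozen reference model.
    """
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    # Push the policy to prefer chosen completions more strongly than the reference does.
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```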

Architecture Innovations

Tokenizers

Fine-Tuning and Optimization

Task-Optimized LLMs and Context Extension

  • Predibase LoRA Land
  • RoPE
  • ALiBi
  • LongRoPE
  • Unsloth+RoPE
  • InfiniAttention: a pathway to ultra long context windows with manageable memory consumption
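
For context: most of the extension methods above build on RoPE, which encodes position by rotating pairs of query/key dimensions at frequencies that decay across channels; methods like LongRoPE and position interpolation rescale those positions or frequencies. A rough numpy sketch (one common pairing convention; real implementations differ in layout):

```python
import numpy as np


def rope_rotate(x, positions, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq_len, dim).

    dim must be even; channel pairs are rotated by an angle that grows with
    position and shrinks with channel index. Context-extension tricks
    (e.g. position interpolation) rescale `positions` or `base`.
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)   # per-channel-pair frequencies
    angles = np.outer(positions, freqs)         # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```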

Infrastructure and Tools

Vector Stores / Information Retrieval

  • pinecone
  • weaviate
  • chroma (open source)
  • lancedb (open source)
  • postgresql + pgvector (open source)
  • sqlite + vss (open source)
  • FAISS by Meta (open source)
  • Vespa.ai + binary embeddings
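
To make the retrieval step concrete, here is a small FAISS example; random vectors stand in for real embeddings, and the dimensionality is just an illustrative choice.

```python
import faiss
import numpy as np

dim = 384                                   # depends on your embedding model
rng = np.random.default_rng(0)

# Stand-in for document embeddings; in practice these come from an embedding model.
doc_vectors = rng.random((1000, dim), dtype=np.float32)

index = faiss.IndexFlatL2(dim)              # exact L2 search; swap for IVF/HNSW at scale
index.add(doc_vectors)

query = rng.random((1, dim), dtype=np.float32)
distances, ids = index.search(query, 5)     # top-5 nearest documents
print(ids[0], distances[0])
```

The hosted stores in this list (Pinecone, Weaviate, etc.) expose the same add/search operations behind an API, plus persistence, filtering, and scaling.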

Telemetry

Cloud Hosting

  • Blueocean / Paperspace for GPUs
  • AWS
  • GCP
  • Azure
  • Hetzner GPU
  • Cloudflare

Notebooks and Code Interpreters

  • Lightning Studio
  • Google Colab
  • ChatGPT
  • Julius.ai

Attention Mechanisms

  • FlashAttention v2
  • HippoAttention
  • RingAttention
  • PagedAttention
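
PyTorch exposes fused attention kernels behind `torch.nn.functional.scaled_dot_product_attention`, which dispatches to a FlashAttention-style implementation when the hardware and dtypes allow it. A minimal sketch:

```python
import torch
import torch.nn.functional as F

batch, heads, seq_len, head_dim = 2, 8, 1024, 64
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)
v = torch.randn(batch, heads, seq_len, head_dim)

# Uses a fused (FlashAttention-style) kernel when available, e.g. on CUDA with
# fp16/bf16 tensors; otherwise falls back to the plain math implementation.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # (batch, heads, seq_len, head_dim)
```

PagedAttention (used by vLLM) is a serving-side variant that manages the KV cache in pages rather than changing the attention math itself.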

Model Merging

  • Efficient Linear Model Merging for LLMs
  • Automerge
  • Sakana Evolutionary Model Merge
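
Linear merging in its simplest form is a weighted average of parameter tensors from checkpoints that share an architecture; the tools above add smarter weighting, sparsification, and evolutionary search on top. A minimal sketch with placeholder checkpoint paths:

```python
import torch


def linear_merge(state_dicts, weights):
    """Weighted average of parameter tensors from same-architecture checkpoints."""
    assert abs(sum(weights) - 1.0) < 1e-6, "weights should sum to 1"
    merged = {}
    for name in state_dicts[0]:
        # Cast to float for the average; integer buffers would need special handling.
        merged[name] = sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
    return merged


# Usage sketch: average two fine-tunes of the same base model 70/30.
# sd_a = torch.load("finetune_a.pt")   # placeholder paths
# sd_b = torch.load("finetune_b.pt")
# merged = linear_merge([sd_a, sd_b], [0.7, 0.3])
```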

Optimizers and Autodifferentiation

Optimizers

  • Adam
  • AdamW
  • Prodigy
  • Schedule-free optimizers (April 2024)
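
In PyTorch, switching between these optimizers is usually a one-line change; for example, AdamW with decoupled weight decay:

```python
import torch

model = torch.nn.Linear(128, 10)  # placeholder model

# AdamW decouples weight decay from the gradient update (unlike classic Adam + L2).
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)

x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = torch.nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()
optimizer.zero_grad()
```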

Autodifferentiation Libraries

  • SymPy
  • torch.autograd
  • Autograd
  • tf.GradientTape
  • gomlx
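
These libraries all expose the same basic idea: record operations on values and differentiate through them. With `torch.autograd`:

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()      # y = x0^2 + x1^2

y.backward()            # dy/dx = 2x
print(x.grad)           # tensor([4., 6.])
```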

Prompt Debugging

  • mitmproxy (via Show Me The Prompt)

Agents and Swarms

Analytics

Chat with Your Data/RAG

  • Weaviate Verba: RAG solution using Weaviate
  • Microsoft GitHub
  • AWS Bedrock: embeddings, Streamlit, LangChain, Pinecone, Claude, etc.
  • AWS Serverless
  • GCP
  • Gemini for document processing
  • AWS knowledge bases for bedrock
  • FLARE: dynamically replaces low-probability tokens with RAG lookups
  • Embedchain

Guardrails and Safety

Protection

  • Llamaguard
  • Llamaguard with streaming
  • Guardrails for AWS Bedrock

Jailbreaks

Embeddings and Document Processing

Embeddings Services

  • Amazon Titan Embeddings
  • Huggingface
  • Nomic + ollama
  • Cohere multi-aspect embeddings
  • LLM2Vec
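
For a self-hosted baseline via Hugging Face, `sentence-transformers` produces embeddings in a few lines; the model name below is just one common choice:

```python
from sentence_transformers import SentenceTransformer

# "all-MiniLM-L6-v2" is a small, commonly used embedding model; swap as needed.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = ["How do I reset my password?", "Steps to change your account password"]
embeddings = model.encode(sentences, normalize_embeddings=True)

# With normalized vectors, cosine similarity is just a dot product.
similarity = embeddings[0] @ embeddings[1]
print(embeddings.shape, float(similarity))
```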

Document Extraction Services

  • Amazon Kendra

Embeddings Algorithms

  • ColBERT
  • Binary quantization (BitNet)
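
Binary quantization compresses float embeddings to one bit per dimension (the sign of each value) and compares vectors by Hamming distance. A small numpy sketch with random stand-in embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 1024)).astype(np.float32)  # stand-in embeddings
query = rng.standard_normal(1024).astype(np.float32)

# Keep only the sign of each dimension: 32x smaller than float32.
doc_bits = embeddings > 0
query_bits = query > 0

# Hamming distance = number of differing bits; smaller is more similar.
hamming = (doc_bits != query_bits).sum(axis=1)
top5 = np.argsort(hamming)[:5]
print(top5, hamming[top5])
```

In practice the binary index is used for a fast first pass, with the full-precision vectors kept around for reranking the shortlist.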

Multi-Adapter Models

For hosting multiple fine-tunes at once

  • Punica
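
Punica-style serving keeps one copy of the base weights and swaps lightweight LoRA adapters per request. At small scale, the same idea with the Hugging Face `peft` library looks roughly like this (model and adapter paths are placeholders):

```python
# Rough sketch with Hugging Face peft; model/adapter names are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-name")

# Attach two fine-tuned LoRA adapters to the same base weights.
model = PeftModel.from_pretrained(base, "path/to/adapter-a", adapter_name="task_a")
model.load_adapter("path/to/adapter-b", adapter_name="task_b")

# Route a request to a specific fine-tune by activating its adapter.
model.set_adapter("task_a")
# ... generate for task A ...
model.set_adapter("task_b")
# ... generate for task B ...
```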

GPU Usage Optimization

  • Run.ai: bare-metal GPU cluster management service, now owned by Nvidia

Important Datasets

  • SST-2: movie review sentiment dataset (HF)
  • 650,000 English books
  • OpenWebText
  • FineWeb

Synthetic Data Generation

  • generator9000

GPUs and Accelerators

  • Groq
  • Truffle-1

Data Curation

  • NeMo-Curator

ML Local Mini Clusters

  • Tinybox / tinygrad
  • WOPR (7 x 4090)

Data Labeling

Model Configuration Management

  • DVCorg
  • WandB Weave