Neural Network Architecture Series
Elena Daehnhardt
Image credit: Illustration created with Midjourney, prompt by the author.
Image prompt: “An illustration representing cloud computing”
This series explores how neural network architecture evolved from residual connections to modern experiments with dense links, transformers, and automated design. Each post builds on the previous one while still standing on its own.
Series Progress
1 of 9 posts published
All Posts in This Series
Part 0: Artificial Neural Networks
Artificial neural networks (ANNs) are the cornerstone of deep learning algorithms. Their name and architecture are borrowed from the neural networks of the human brain. ANNs are designed to simulate human reasoning based on how neurons communicate, and they consist of a set of interconnected artificial neurons.

Part 1: Understanding Neural Network Architecture: From Basics to Residual Connections
Coming Soon
Before diving into cutting-edge architectural innovations, understand the foundations: how neural networks are structured, why depth creates problems, and how residual connections solved them. This post is currently being written and will be published soon.

Part 2: DeepSeek's mHC: Making Neural Networks Learn Better by Preserving Identity
Coming Soon
DeepSeek's mHC architecture solves a fundamental tension in neural networks: how to allow rich information flow between layers while maintaining training stability. The result is measurably better performance on reasoning tasks. This post is currently being written and will be published soon.

Part 3: Dense Connections and Information Flow: From DenseNet to Modern Variants
Coming Soon
Explore dense connectivity patterns, why they improve gradient flow, and when dense connections outperform residual links. This post is currently being written and will be published soon.

Part 4: Attention Mechanisms Meet Residual Connections: How Transformers Use Skip Connections
Coming Soon
A practical guide to residual connections in transformers: equations, Pre-LN vs Post-LN trade-offs, failure modes, and implementation details for stable training. This post is currently being written and will be published soon.

Part 5: Neural Architecture Search: Automating the Discovery of Better Connections
Coming Soon
How NAS explores connection patterns and what it reveals about architecture design. This post is currently being written and will be published soon.

Part 6: Bottlenecks, Inverted Residuals, and Mobile Architectures
Coming Soon
How mobile architectures adapt residual designs to fit edge devices. This post is currently being written and will be published soon.

Part 7: The Future of Connections: From Manifold Constraints to Dynamic Routing
Coming Soon
A survey of emerging connection patterns and where architecture research may head next. This post is currently being written and will be published soon.

Part 8: Transformers in Practice: From Architecture to a Chatbot Implementation
Coming Soon
A practical bridge from transformer architecture concepts to implementation: build and evaluate a chatbot workflow with TensorFlow and HuggingFace, with clear trade-offs and deployment guidance. This post is currently being written and will be published soon.

Getting Started
New to this series? Start with Part 0: Artificial Neural Networks.
Each post builds on the previous one, so reading them in order is recommended. Each post also works on its own if you need to jump to a specific topic.