Decoding KAN: Kolmogorov-Arnold Network
Introduction to Kolmogorov-Arnold Networks (KANs)
Kolmogorov-Arnold Networks (KANs) represent an intriguing approach in the field of neural networks, drawing on the mathematical foundation provided by the Kolmogorov-Arnold representation theorem. This theorem states that any multivariate continuous function can be expressed as a finite superposition of continuous functions of a single variable, combined through addition.
Theoretical Background
The Kolmogorov-Arnold representation theorem states that any continuous function f(x₁, …, xₙ) on [0, 1]ⁿ can be written as

f(x₁, …, xₙ) = Σ_{q=0}^{2n} Φ_q( Σ_{p=1}^{n} φ_{q,p}(x_p) )

where each φ_{q,p} and each Φ_q is a continuous function of a single variable. In other words, addition is the only genuinely multivariate operation required; all remaining structure lives inside univariate functions.
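For intuition, the target function we will fit later in this article already has this form for n = 2, using a single term with the identity as the outer function (the decomposition below is spelled out for illustration):

f(x, y) = sin(x) + cos(y) = Φ(φ₁(x) + φ₂(y)),  with Φ(t) = t, φ₁(x) = sin(x), φ₂(y) = cos(y)

Most functions are not this convenient: in general, the univariate functions guaranteed by the theorem can be highly non-smooth, which is why practical KANs learn smooth approximations instead.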
Implementation of KANs
Let’s delve into a basic implementation of a KAN. The idea is to model the network on the theorem’s structure: learnable univariate functions applied to each input, combined by an outer function. Here’s a simplified example in Python using PyTorch. For demonstration, we’ll use the target function f(x, y) = sin(x) + cos(y), which we’ll try to learn with our KAN structure. Note that the implementation below is deliberately minimal: it uses one univariate function per input and a single linear outer layer, rather than the theorem’s full superposition of 2n + 1 terms.
Sample Code
Here’s a Python snippet implementing a very basic version of a KAN using PyTorch:
import torch
import torch.nn as nn
import numpy as np

# Define the univariate functions
class UniVariateFunction(nn.Module):
    def __init__(self, output_size):
        super(UniVariateFunction, self).__init__()
        self.linear = nn.Linear(1, output_size)

    def forward(self, x):
        x = self.linear(x)
        return torch.sin(x)  # Using sin as the activation function
# Define the KAN model
class KAN(nn.Module):
    def __init__(self):
        super(KAN, self).__init__()
        # One learnable univariate function (phi) per input variable
        self.phi = nn.ModuleList([UniVariateFunction(1) for _ in range(2)])
        # Outer function (Phi) combining the univariate outputs
        self.Phi = nn.Linear(2, 1)

    def forward(self, x):
        x1, x2 = x[:, 0], x[:, 1]
        x1 = self.phi[0](x1.view(-1, 1))
        x2 = self.phi[1](x2.view(-1, 1))
        out = torch.cat((x1, x2), dim=1)
        out = self.Phi(out)
        return out
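# Optional sanity check (added for illustration): a batch of 4 (x, y)
# pairs should map to 4 scalar outputs.
assert KAN()(torch.randn(4, 2)).shape == (4, 1)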
# Generate sample data on a 200 x 200 grid over [-pi, pi]^2
x = torch.linspace(-np.pi, np.pi, 200)
y = torch.linspace(-np.pi, np.pi, 200)
X, Y = torch.meshgrid(x, y, indexing='ij')
Z = torch.sin(X) + torch.cos(Y)

# Prepare inputs, targets, and model
inputs = torch.stack([X.flatten(), Y.flatten()], dim=1)
targets = Z.reshape(-1, 1)  # match the model's (N, 1) output shape
model = KAN()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
# Training loop
for epoch in range(1000):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(f'Epoch {epoch}, Loss: {loss.item()}')
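Once training finishes, querying the model is just a forward pass over an (N, 2) tensor. Here’s a quick spot check at an arbitrary point (the values are chosen purely for illustration):

# Query the trained model at one point and compare with the true value
test_point = torch.tensor([[0.5, 1.0]])
with torch.no_grad():
    prediction = model(test_point)
print(f'KAN: {prediction.item():.4f}  true: {np.sin(0.5) + np.cos(1.0):.4f}')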
Visual Explanation
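To make the fit visible, we can plot the learned surface next to the target. The snippet below is a minimal sketch using matplotlib, assuming the tensors and the trained model from the listing above are still in scope:

import matplotlib.pyplot as plt

# Evaluate the trained model over the full grid
with torch.no_grad():
    Z_pred = model(inputs).reshape(X.shape)

# Side-by-side 3D surface plots: target vs. learned function
fig, axes = plt.subplots(1, 2, figsize=(10, 4), subplot_kw={'projection': '3d'})
axes[0].plot_surface(X.numpy(), Y.numpy(), Z.numpy(), cmap='viridis')
axes[0].set_title('Target: sin(x) + cos(y)')
axes[1].plot_surface(X.numpy(), Y.numpy(), Z_pred.numpy(), cmap='viridis')
axes[1].set_title('KAN prediction')
plt.show()

In principle the network can fit this target exactly: each branch learns sin(w·t + b), and cos(y) is just a phase-shifted sine.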
Conclusion
Kolmogorov-Arnold Networks represent a novel way to structure neural networks that could lead to more efficient learning, particularly for complex multivariate functions.