The library currently lacks a dedicated linear/dense layer, which is essential for:
- Text generation (projecting the LSTM hidden state to vocabulary logits)
- Classification tasks (projecting the hidden state to class logits, which a softmax then turns into probabilities)
- Custom output projections
Proposed Implementation:
pub struct LinearLayer {
    weight: Array2<f64>, // (output_size, input_size)
    bias: Array1<f64>,   // (output_size,)
}

impl LinearLayer {
    pub fn new(input_size: usize, output_size: usize) -> Self
    pub fn forward(&self, input: &Array2<f64>) -> Array2<f64>
    pub fn backward(&self, grad_output: &Array2<f64>) -> (LinearGradients, Array2<f64>)
}
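To make the intended shapes and gradient math concrete, here is a rough, dependency-free sketch of what the layer could compute. It is not the proposed implementation, and several things in it are assumptions: it uses `Vec<Vec<f64>>` in place of `Array2<f64>` so it compiles without `ndarray`, it lays inputs out as `(input_size, batch_size)`, it zero-initialises the parameters, and the field names on `LinearGradients` are made up for illustration. It also passes the forward input to `backward` explicitly, since the weight gradient depends on that input; with the `&self`-only signature above, the layer would have to cache it during `forward` instead.

```rust
/// Parameter gradients (hypothetical field layout, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct LinearGradients {
    pub weight: Vec<Vec<f64>>, // (output_size, input_size)
    pub bias: Vec<f64>,        // (output_size,)
}

pub struct LinearLayer {
    pub weight: Vec<Vec<f64>>, // (output_size, input_size)
    pub bias: Vec<f64>,        // (output_size,)
}

impl LinearLayer {
    /// Zero-initialised for simplicity; a real layer would likely use random init.
    pub fn new(input_size: usize, output_size: usize) -> Self {
        Self {
            weight: vec![vec![0.0; input_size]; output_size],
            bias: vec![0.0; output_size],
        }
    }

    /// Computes y = W x + b for each column of `input`.
    /// Assumed layout: `input` is (input_size, batch_size).
    pub fn forward(&self, input: &[Vec<f64>]) -> Vec<Vec<f64>> {
        let batch = input.first().map_or(0, |row| row.len());
        let mut out = vec![vec![0.0; batch]; self.bias.len()];
        for (o, w_row) in self.weight.iter().enumerate() {
            for b in 0..batch {
                let dot: f64 = w_row
                    .iter()
                    .zip(input)
                    .map(|(w, x_row)| w * x_row[b])
                    .sum();
                out[o][b] = dot + self.bias[o];
            }
        }
        out
    }

    /// Given dL/dy (output_size, batch_size) and the input used in `forward`,
    /// returns the parameter gradients and dL/dx (input_size, batch_size):
    ///   dW = dL/dy * x^T,  db = sum of dL/dy over the batch,  dL/dx = W^T * dL/dy
    pub fn backward(
        &self,
        input: &[Vec<f64>],
        grad_output: &[Vec<f64>],
    ) -> (LinearGradients, Vec<Vec<f64>>) {
        let in_size = input.len();
        let out_size = self.bias.len();
        let batch = grad_output.first().map_or(0, |row| row.len());

        let mut grad_w = vec![vec![0.0; in_size]; out_size];
        let mut grad_b = vec![0.0; out_size];
        let mut grad_x = vec![vec![0.0; batch]; in_size];

        for o in 0..out_size {
            for b in 0..batch {
                let g = grad_output[o][b];
                grad_b[o] += g;
                for i in 0..in_size {
                    grad_w[o][i] += g * input[i][b];
                    grad_x[i][b] += self.weight[o][i] * g;
                }
            }
        }

        (LinearGradients { weight: grad_w, bias: grad_b }, grad_x)
    }
}
```

With `ndarray`, the same three gradients would presumably reduce to a couple of matrix products and a sum over the batch axis, so the real implementation should be much shorter than this loop-based version.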
Benefits:
- Enables standard text generation architectures
- Simplifies classification tasks
- Reduces boilerplate code for users