Input Layer
This is the number of features in your input data (e.g., for a 28x28 image, this would be 784).
Build your network architecture layer by layer to find the total trainable parameters.