cpp/html/conv2d_8h_source.html

 /*

     Beatmup image and signal processing library

     Copyright (C) 2020, lnstadrum


     This program is free software: you can redistribute it and/or modify

     it under the terms of the GNU General Public License as published by

     the Free Software Foundation, either version 3 of the License, or

     (at your option) any later version.


     This program is distributed in the hope that it will be useful,

     but WITHOUT ANY WARRANTY; without even the implied warranty of

     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

     GNU General Public License for more details.


     You should have received a copy of the GNU General Public License

     along with this program.  If not, see <http://www.gnu.org/licenses/>.

 */


 #pragma once


 #include "operation.h"

 #include "../gpu/texture_handler.h"

 #include <vector>

 #include <array>

 #include <map>


 namespace Beatmup {

     namespace NNets {


         /**

             2D convolution operation computed on GPU.

             Has 2 inputs: main and residual (detailed below), and a single output.

             Constraints:

                 - Input and output contain values in [0, 1] range sampled over 8 bits.

                 - Number of input channels is 3 (i.e., the input is an RGB image) or a multiple of 4.

                 - Number of output feature maps is a multiple of 4.

                 - For group convolutions, each group contains a multiple of 4 input channels and a multiple of 4 output

                   channels, or exactly 1 input and 1 output channel (i.e., depthwise).

                 - Kernels are of square shape.

                 - Strides are equal along X and Y.

                 - Dilations are equal to 1.

                 - If an image is given on input (3 input feature maps), only valid padding is supported.

                 - An activation function is always applied on output.


             Raspberry Pi-related constraints:

                 - Pi cannot sample more than 256 channels to compute a single output value. The practical limit is

                   lower still: about 128 channels for pointwise convolutions and fewer than 100 channels for

                   bigger kernels. When the limit is reached, the Pi OpenGL driver reports an out-of-memory error (0x505).


             Features:

                 - Bias addition integrated.

                 - An optional residual input is available: a tensor of output shape added to the convolution result

                   before applying the activation function.


             Convolution filters and bias are looked up in chunks. The chunk names consist of the operation name followed

             by Conv2D::FILTERS_CHUNK_SUFFIX and Conv2D::BIAS_CHUNK_SUFFIX respectively.

             The chunk contents are single-precision floating-point arrays.

             The filter coefficients are taken in "OIHW" layout, i.e., there are 'O*I' contiguous packets of 'H*W'

             values each. "O" and "I" are output and input channel numbers, "H" and "W" are filter height and width.

         */

         class Conv2D :

             public AbstractOperation, protected SpatialFilteringMixin, protected ActivationFunctionMixin

         {

         private:

             const Size kernelSize;

             const int numOutputChannels;                    //!< number of output feature maps

             const int numGroups;                            //!< number of convolution groups

             const int stride;

             const Size::Padding padding;

             const bool useInputImage;                       //!< if `true`, input is the texture handler, not the view

             const bool isDepthwise;                         //!< if `true`, the convolution is depthwise, otherwise regular

             const bool useBias;                             //!< if `true`, the bias addition is enabled

             bool ready;


             Storage::View input, output;

             Storage::View residualInput;                    //!< optional tensor to be added to the output before activation

             GL::TextureHandler *inputImage;                 //!< input texture handler to be used instead of the input view

             std::vector<GL::RenderingProgram*> programs;    //!< pointers to GLSL programs, one per quad of output channels

             std::vector<std::array<float, 4>> coeffs;       //!< model data to pass to uniform variables, if used

             std::vector<int> execOrder;                     //!< execution order of GLSL programs

             std::vector<Storage::View> groupViews;          //!< views per convolution group


             /**

                 Maps an (outputChannel, inputChannel, x, y) position to a linear coefficient index in the chunkfile.

             */

             inline int getIdx(int output, int input, int x, int y) const {

                 return output + numOutputChannels * (input + kernelSize[2] * (x + kernelSize[0] * y));

             }


             void prepare(GraphicPipeline& gpu, ChunkCollection& data, GL::ProgramBank& bank);

             void execute(TaskThread& thread, GraphicPipeline& gpu);

             int getInputPadding(int index = 0) const;

             void getSampledChannels(int index, int& min, int& max) const;


         public:

             static const char* FILTERS_CHUNK_SUFFIX;  //!< suffix added to the op name to get the filters chunk id in the model data

             static const char* BIAS_CHUNK_SUFFIX;     //!< suffix added to the op name to get the bias chunk id in the model data


             /**

                 Instantiates a 2D convolution operation.

                 \param[in] name                 Operation name

                 \param[in] kernelSize           Convolution kernel size

                 \param[in] numInputChannels     Number of input feature map channels (input depth)

                 \param[in] numOutputChannels    Number of output feature map channels (output depth)

                 \param[in] stride               Convolution stride

                 \param[in] padding              Padding policy

                 \param[in] useBias              If `true`, the bias addition is enabled. The bias vector is looked up in the model data.

                 \param[in] numGroups            Number of convolution groups to get a group/depthwise convolution

                 \param[in] activation           Activation function applied to the operation output

             */

             Conv2D(

                 const std::string& name,

                 const int kernelSize,

                 const int numInputChannels,

                 const int numOutputChannels,

                 const int stride = 1,

                 const Size::Padding padding = Size::Padding::VALID,

                 const bool useBias = true,

                 const int numGroups = 1,

                 const ActivationFunction activation = ActivationFunction::DEFAULT

             );


             inline bool isBiasUsed() const { return useBias; }


             inline int getInputCount()  const { return 2; }

             inline int getOutputCount() const { return 1; }


             inline bool acceptsStorageInput(int index = 0) const { return (index == 0 && !useInputImage) || index == 1; }

             inline bool acceptsStorageOutput(int index = 0) const { return index == 0; }

             inline bool acceptsTextureInput(int index = 0) const { return index == 0 && useInputImage; }


             Size getOutputSize(int outputIndex = 0) const;


             inline Storage::View getOutput(int index = 0) { return output; }


             void setInput(Storage::View&& storage, int inputIndex = 0);

             void setInput(GL::TextureHandler& image, int inputIndex = 0);

             void setOutput(Storage::View&& storage, int outputIndex = 0);


             std::map<std::string, std::string> serialize() const;


             void disconnect();


             /**

                 \brief Connects a tensor to a residual input.

                 This input is optional. The tensor is added to the convolution result before the non-linear activation

                 is applied. Its size must match the output size.

                 \param[in] storage      A storage view containing the residual input tensor.

             */

             inline void setResidualInput(Storage::View&& storage) { setInput(std::move(storage), 1); }


             unsigned long countMultiplyAdds() const;

             unsigned long countTexelFetches() const;


             /**

                 Sets up deserialization of the operation.

             */

             static bool initDeserializer();

         };


         /**

             \internal

             Being declared here, this variable ensures Conv2D::initDeserializer() is called whenever this header file is included.

         */

         static const bool CONV2D_OP_DESERIALIZABLE = Conv2D::initDeserializer();

     }

 }

Beatmup::ChunkCollection
A key-value pair set storing pieces of arbitrary data (chunks) under string keys.
Definition: chunkfile.h:36

Beatmup::GL::ProgramBank
Stores linked GLSL programs and their associated fragment shader codes.
Definition: program_bank.h:31

Beatmup::GL::TextureHandler
Definition: texture_handler.h:37

Beatmup::GraphicPipeline
Internal low-level GPU control API.
Definition: pipeline.h:33

Beatmup::NNets::AbstractOperation
Abstract neural net operation (layer).
Definition: operation.h:46

Beatmup::NNets::AbstractOperation::name
std::string name
Definition: operation.h:50

Beatmup::NNets::ActivationFunctionMixin
A mixin implementing activation functions in GLSL.
Definition: operation.h:414

Beatmup::NNets::Conv2D
2D convolution operation computed on GPU.
Definition: conv2d.h:64
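
The channel, group and padding constraints listed in the class description can be checked before an operation is constructed. The following is a minimal sketch only: `isSupportedConv2DConfig` is a hypothetical helper that is not part of Beatmup, and it assumes that channels are split evenly across groups, which the header does not state explicitly.

```cpp
#include <cassert>

// Hypothetical helper, not part of Beatmup: checks a layer configuration
// against the channel, group and padding constraints documented for Conv2D.
inline bool isSupportedConv2DConfig(int numInputChannels, int numOutputChannels,
                                    int numGroups, bool validPadding) {
    // Non-positive counts are treated as programmer errors here.
    assert(numInputChannels > 0 && numOutputChannels > 0 && numGroups > 0);

    // Input is either an RGB image (3 channels) or a multiple of 4 channels.
    const bool imageInput = (numInputChannels == 3);
    if (!imageInput && numInputChannels % 4 != 0)
        return false;

    // Number of output feature maps is a multiple of 4.
    if (numOutputChannels % 4 != 0)
        return false;

    // With an image on input, only valid padding is supported.
    if (imageInput && !validPadding)
        return false;

    if (numGroups > 1) {
        // Assumption: channels are split evenly across the groups.
        if (numInputChannels % numGroups != 0 || numOutputChannels % numGroups != 0)
            return false;
        const int inPerGroup = numInputChannels / numGroups;
        const int outPerGroup = numOutputChannels / numGroups;
        // Each group holds multiples of 4 channels, or exactly 1 in and 1 out (depthwise).
        const bool depthwise = (inPerGroup == 1 && outPerGroup == 1);
        const bool quads = (inPerGroup % 4 == 0 && outPerGroup % 4 == 0);
        if (!depthwise && !quads)
            return false;
    }
    return true;
}
```

For instance, a 32-channel depthwise layer (32 inputs, 32 outputs, 32 groups) passes this check, while 32 inputs and 64 outputs in 16 groups does not, since each group would then hold 2 input channels.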

Beatmup::NNets::Conv2D::isDepthwise
const bool isDepthwise
if true, the convolution is depthwise, otherwise regular
Definition: conv2d.h:72

Beatmup::NNets::Conv2D::coeffs
std::vector< std::array< float, 4 > > coeffs
model data to pass to uniform variables, if used
Definition: conv2d.h:80

Beatmup::NNets::Conv2D::getOutput
Storage::View getOutput(int index=0)
Returns a storage view bound to a specific operation output.
Definition: conv2d.h:135

Beatmup::NNets::Conv2D::BIAS_CHUNK_SUFFIX
static const char * BIAS_CHUNK_SUFFIX
suffix added to the op name to get the bias chunk id in the model data
Definition: conv2d.h:98

Beatmup::NNets::Conv2D::setOutput
void setOutput(Storage::View &&storage, int outputIndex=0)
Definition: conv2d.cpp:512

Beatmup::NNets::Conv2D::numGroups
const int numGroups
number of convolution groups
Definition: conv2d.h:68

Beatmup::NNets::Conv2D::programs
std::vector< GL::RenderingProgram * > programs
pointers to GLSL programs, one per quad of output channels
Definition: conv2d.h:79

Beatmup::NNets::Conv2D::useBias
const bool useBias
if true, the bias addition is enabled
Definition: conv2d.h:73

Beatmup::NNets::Conv2D::execute
void execute(TaskThread &thread, GraphicPipeline &gpu)
Executes the operation.
Definition: conv2d.cpp:272

Beatmup::NNets::Conv2D::padding
const Size::Padding padding
Definition: conv2d.h:70

Beatmup::NNets::Conv2D::ready
bool ready
Definition: conv2d.h:74

Beatmup::NNets::Conv2D::getInputCount
int getInputCount() const
Returns number of operation inputs.
Definition: conv2d.h:126

Beatmup::NNets::Conv2D::stride
const int stride
Definition: conv2d.h:69

Beatmup::NNets::Conv2D::getOutputSize
Size getOutputSize(int outputIndex=0) const
Returns full size of a specific operation output.
Definition: conv2d.cpp:397

Beatmup::NNets::Conv2D::inputImage
GL::TextureHandler * inputImage
input texture handler to be used instead of the input view
Definition: conv2d.h:78

Beatmup::NNets::Conv2D::initDeserializer
static bool initDeserializer()
Sets up deserialization of the operation.

Beatmup::NNets::Conv2D::prepare
void prepare(GraphicPipeline &gpu, ChunkCollection &data, GL::ProgramBank &bank)
Compiles GLSL shaders.
Definition: conv2d.cpp:88

Beatmup::NNets::Conv2D::serialize
std::map< std::string, std::string > serialize() const
Returns a serialized representation of the operation.
Definition: conv2d.cpp:415

Beatmup::NNets::Conv2D::FILTERS_CHUNK_SUFFIX
static const char * FILTERS_CHUNK_SUFFIX
suffix added to the op name to get the filters chunk id in the model data
Definition: conv2d.h:97
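
The class description states that the filters chunk is stored in "OIHW" layout, i.e., 'O*I' contiguous packets of 'H*W' values. A minimal sketch of the corresponding offset computation is given below. `oihwOffset` and its parameter names are hypothetical and not part of Beatmup; how "I" is counted for group convolutions is not specified in the header and is left to the caller here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of Beatmup: offset of the coefficient for
// output channel o, input channel i, row y and column x in a flat OIHW array.
inline std::size_t oihwOffset(int o, int i, int y, int x,
                              int numOutput, int numInput, int height, int width) {
    assert(0 <= o && o < numOutput);
    assert(0 <= i && i < numInput);
    assert(0 <= y && y < height);
    assert(0 <= x && x < width);
    // Packet number (o * I + i) selects one H*W filter slice;
    // inside the slice the values are stored row by row.
    const std::size_t packet = static_cast<std::size_t>(o) * numInput + i;
    return (packet * height + y) * width + x;
}
```

With 8 output channels, 4 input channels and a 3x3 kernel, the array holds 8*4*9 = 288 values, and the last coefficient (o=7, i=3, y=2, x=2) lands at offset 287.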

Beatmup::NNets::Conv2D::useInputImage
const bool useInputImage
if true, input is the texture handler, not the view
Definition: conv2d.h:71

Beatmup::NNets::Conv2D::output
Storage::View output
Definition: conv2d.h:76

Beatmup::NNets::Conv2D::acceptsTextureInput
bool acceptsTextureInput(int index=0) const
Returns true if the operation can take a GL::TextureHandler at a specific input.
Definition: conv2d.h:131

Beatmup::NNets::Conv2D::execOrder
std::vector< int > execOrder
execution order of GLSL programs
Definition: conv2d.h:81

Beatmup::NNets::Conv2D::getSampledChannels
void getSampledChannels(int index, int &min, int &max) const
Retrieves the range of input feature channels sampled at the same time for a specific input.
Definition: conv2d.cpp:382

Beatmup::NNets::Conv2D::getInputPadding
int getInputPadding(int index=0) const
Retrieves minimum required size of zero padding for a given input.
Definition: conv2d.cpp:377

Beatmup::NNets::Conv2D::setResidualInput
void setResidualInput(Storage::View &&storage)
Connects a tensor to a residual input.
Definition: conv2d.h:151

Beatmup::NNets::Conv2D::acceptsStorageInput
bool acceptsStorageInput(int index=0) const
Returns true if the operation can take a Storage::View at a specific input.
Definition: conv2d.h:129

Beatmup::NNets::Conv2D::isBiasUsed
bool isBiasUsed() const
Definition: conv2d.h:124

Beatmup::NNets::Conv2D::acceptsStorageOutput
bool acceptsStorageOutput(int index=0) const
Returns true if the operation can take a Storage::View at a specific output.
Definition: conv2d.h:130

Beatmup::NNets::Conv2D::numOutputChannels
const int numOutputChannels
number of output feature maps
Definition: conv2d.h:67

Beatmup::NNets::Conv2D::groupViews
std::vector< Storage::View > groupViews
views per convolution group
Definition: conv2d.h:82

Beatmup::NNets::Conv2D::input
Storage::View input
Definition: conv2d.h:76

Beatmup::NNets::Conv2D::setInput
void setInput(Storage::View &&storage, int inputIndex=0)
Definition: conv2d.cpp:483

Beatmup::NNets::Conv2D::countMultiplyAdds
unsigned long countMultiplyAdds() const
Counts (approximate) number of multiply-adds used by this operation.
Definition: conv2d.cpp:528

Beatmup::NNets::Conv2D::kernelSize
const Size kernelSize
Definition: conv2d.h:66

Beatmup::NNets::Conv2D::getOutputCount
int getOutputCount() const
Returns number of operation outputs.
Definition: conv2d.h:127

Beatmup::NNets::Conv2D::Conv2D
Conv2D(const std::string &name, const int kernelSize, const int numInputChannels, const int numOutputChannels, const int stride=1, const Size::Padding padding=Size::Padding::VALID, const bool useBias=true, const int numGroups=1, const ActivationFunction activation=ActivationFunction::DEFAULT)
Instantiates a 2D convolution operation.
Definition: conv2d.cpp:41

Beatmup::NNets::Conv2D::disconnect
void disconnect()
Assigns empty inputs and outputs.
Definition: conv2d.cpp:474

Beatmup::NNets::Conv2D::residualInput
Storage::View residualInput
optional tensor to be added to the output before activation
Definition: conv2d.h:77

Beatmup::NNets::Conv2D::countTexelFetches
unsigned long countTexelFetches() const
Counts (approximate) number of texel fetches.
Definition: conv2d.cpp:533

Beatmup::NNets::Conv2D::getIdx
int getIdx(int output, int input, int x, int y) const
Maps an (outputChannel, inputChannel, x, y) position to a linear coefficient index in the chunkfile.
Definition: conv2d.h:87
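
To make the arithmetic of `getIdx` easier to follow, the sketch below restates the expression from the header as a free function with explicit parameters. `coefficientIndex`, `kernelDepth` and `kernelWidth` are hypothetical names: they stand in for `kernelSize[2]` and `kernelSize[0]`, on the assumption that these are the kernel depth and width. As written, the output channel index varies fastest and `y` varies slowest.

```cpp
#include <cassert>

// Hypothetical standalone restatement of the Conv2D::getIdx expression.
// Assumption: kernelDepth corresponds to kernelSize[2] and kernelWidth to kernelSize[0].
inline int coefficientIndex(int output, int input, int x, int y,
                            int numOutputChannels, int kernelDepth, int kernelWidth) {
    assert(0 <= output && output < numOutputChannels);
    assert(0 <= input && input < kernelDepth);
    assert(0 <= x && x < kernelWidth);
    assert(y >= 0);
    // Same nesting as in the header: output is the innermost (fastest) index,
    // followed by input, then x, then y.
    return output + numOutputChannels * (input + kernelDepth * (x + kernelWidth * y));
}
```

For 8 output channels, a kernel depth of 4 and a kernel width of 3, one step along the output index moves by 1, along the input index by 8, along x by 32, and along y by 96.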

Beatmup::NNets::Size
Operation 3D input/output size.
Definition: storage.h:37

Beatmup::NNets::Size::Padding
Padding
Zero padding specification.
Definition: storage.h:45

Beatmup::NNets::Size::Padding::VALID
@ VALID
no zero padding

Beatmup::NNets::SpatialFilteringMixin
Generates GLSL fragment shader code sampling a local neighborhood around the current texture coordina...
Definition: operation.h:272

Beatmup::NNets::Storage::View
Maps a 3D tensor onto a storage.
Definition: storage.h:308

Beatmup::TaskThread
Thread executing tasks.
Definition: parallelism.h:154

Beatmup::NNets::CONV2D_OP_DESERIALIZABLE
static const bool CONV2D_OP_DESERIALIZABLE
Definition: conv2d.h:166

Beatmup::NNets::ActivationFunction
ActivationFunction
Activation function specification.
Definition: operation.h:401

Beatmup::NNets::ActivationFunction::DEFAULT
@ DEFAULT
default activation: 0..1 bounded ReLU (identity clipped to 0..1 range)
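
The way bias, residual input and the default activation combine for a single output value can be sketched on the CPU as follows. This is an illustration of the documented order of operations only, not the GLSL code the library generates; `boundedRelu` and `fuseOutput` are hypothetical names, and the 8-bit sampling of the stored result is left out.

```cpp
#include <algorithm>
#include <cassert>

// Default activation as documented: identity clipped to the 0..1 range.
inline float boundedRelu(float value) {
    return std::min(1.0f, std::max(0.0f, value));
}

// Hypothetical illustration: bias and the optional residual value are added
// to the convolution result before the activation function is applied.
inline float fuseOutput(float convResult, float bias, float residual) {
    const float result = boundedRelu(convResult + bias + residual);
    assert(0.0f <= result && result <= 1.0f);  // output stays in the [0, 1] range
    return result;
}
```

For example, a convolution result of 0.25 with a bias of 0.125 and a residual value of 0.25 gives 0.625, while sums above 1 or below 0 are clipped to 1 and 0 respectively.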

Beatmup
Definition: basic_types.h:22
