Custom Shader Effects

While Corona has an extensive list of built-in shader effects, there are times when you may need to create custom effects. This guide outlines how to create custom effects using custom shader code, structured in the same way that Corona's built-in shader effects are implemented.

GPU Rendering Pipeline
Programmable Effects
Creating Custom Effects
Vertex Kernels
Fragment Kernels
Custom Varying Variables
Effect Parameters
GLSL Conventions and Best Practices

Notes

Writing custom effects is an advanced developer feature. This guide assumes that you are already fluent in GLSL ES (OpenGL ES 2.0).
Custom effects are supported on iOS, Android, macOS desktop, and Win32 desktop. Windows Phone 8 does not support this capability because it only supports precompiled shaders.

GPU Rendering Pipeline

In a programmable graphics pipeline, the GPU is treated as a stream processor. Data flows through multiple processing units and each unit is capable of running a (shader) program.

In OpenGL ES 2.0, data flows from (1) the application to (2) the vertex processor to (3) the fragment processor and finally to (4) the framebuffer/screen.

In Corona, you do not write complete shader programs. Instead, custom shader effects are exposed as vertex and fragment kernels, which allow you to create powerful programmable effects.

Programmable Effects

Corona allows you to extend its pipeline to create several types of custom programmable effects, organized based on the number of input textures:

Generators — Procedurally-generated effects which don't operate on any textures/images.
Filters — Effects which operate on a single texture/image (BitmapPaint).
Composites — Effects which operate on two textures/images, combined together as a CompositePaint.

Creating Custom Effects

To define a new effect, call graphics.defineEffect(), passing in a Lua table which defines the effect. In order for this table to be a valid effect definition, it must contain several properties:

category — The type of effect.
name — The name within a given category.
vertex and/or fragment — Defines where your shader code goes, as described in the Defining Kernels section below.

A complete and detailed description of all properties is available in the graphics.defineEffect() documentation.
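
As a minimal sketch of a complete definition, the following registers a generator that fills an object with a horizontal color ramp. The effect name "myGradient" is hypothetical, and the category string "generator" is assumed to correspond to the generator type described above. The fully-qualified name used in the last line follows the format described under Naming Effects.

```lua
local kernel = {}

-- "generator.custom.myGradient" (hypothetical name)
kernel.category = "generator"
kernel.name = "myGradient"

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    // Red-to-blue ramp computed from the texture coordinate alone;
    // no texture is sampled
    P_COLOR vec4 color = vec4( texCoord.x, 0.0, 1.0 - texCoord.x, 1.0 );

    return CoronaColorScale( color );
}
]]

-- Register the effect so that display objects can refer to it by name
graphics.defineEffect( kernel )

-- Apply it to a rectangle (position and size are arbitrary)
local rect = display.newRect( display.contentCenterX, display.contentCenterY, 200, 200 )
rect.fill.effect = "generator.custom.myGradient"
```

Because only a fragment kernel is specified, Corona supplies its default vertex kernel.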

Naming Effects

The name of an effect is determined by the following properties:

category — The type of effect.
group — The group of effect. If not provided, Corona will assume custom.
name — The name within a given category.

When you set an effect on a display object, you must provide a fully-qualified string by concatenating the above values and separating each by a . as follows:

local effectName = "[category].[group].[name]"
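
The concatenation rule can be sketched as a small helper. The function qualifiedEffectName is hypothetical and not part of the Corona API; it only illustrates how the three properties combine and how the group falls back to "custom".

```lua
-- Hypothetical helper (not part of the Corona API): builds the
-- fully-qualified effect name from an effect definition table.
local function qualifiedEffectName( kernel )
    -- The group falls back to "custom" when it is not provided
    local group = kernel.group or "custom"
    return kernel.category .. "." .. group .. "." .. kernel.name
end

print( qualifiedEffectName( { category = "filter", name = "myWobble" } ) )
--> filter.custom.myWobble
```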

Defining Kernels

Corona packages snippets of shader code in the form of kernels. By structuring specific vertex and fragment processing tasks in kernels, the creation of custom effects is dramatically simplified.

Essentially, a kernel is shader code which the main shader program relies upon to handle specific processing tasks. Corona supports both vertex kernels and fragment kernels. You must specify at least one kernel type in your effect, and you may specify both. If a vertex or fragment kernel is not specified, Corona inserts the corresponding default kernel.

Vertex Kernels

Vertex kernels operate on a per-vertex basis, enabling you to modify vertex positions before they are used in the next stage of the pipeline. They must define the following function which accepts an incoming position and can modify that vertex position.

Corona's default vertex kernel simply returns the incoming position:

P_POSITION vec2 VertexKernel( P_POSITION vec2 position )
{
    return position;
}

Time

The following time uniforms can be accessed by the vertex kernel:

P_DEFAULT float CoronaTotalTime — Running time of the app in seconds.
P_DEFAULT float CoronaDeltaTime — Time in seconds since previous frame.

If you use these variables in your kernel's shader code, your kernel is implicitly time-dependent. In other words, your kernel will output different results and evolve as time progresses.

When using these variables, you need to tell Corona that your shader requires the GPU to re-render the scene, even if there are no other changes to the display objects in the scene. You can do this by setting the kernel.isTimeDependent property in your kernel definition as indicated below. Note that you should only set this if your shader code is truly time-dependent, since it effectively forces the GPU to re-render every frame.

kernel.isTimeDependent = true

Size

The following size uniforms can be accessed by the vertex kernel:

P_POSITION vec2 CoronaContentScale — The number of content pixels per screen pixel along the x and y axes. Content pixels refer to Corona's coordinate system and are determined by the content scaling settings for your project.
P_UV vec4 CoronaTexelSize — These values help you understand normalized texture pixels (texels) as they relate to actual pixels. This is useful because texture coordinates are normalized (0 to 1) and normally you only have information about proportion (the percentage of the width or height of the texture). Effectively, these values help you create effects based on actual screen/content pixel distances.

Value	Definition
`CoronaTexelSize.xy`	The number of texels per screen pixel along the x and y axes.
`CoronaTexelSize.zw`	The number of texels per content pixel along the x and y axes, initially the same as `CoronaTexelSize.xy`. This is useful in creating resolution-independent effects that account for the additional pixel density due to dynamic image selection. Essentially, when a retina/HD image is selected, these components are divided by `CoronaContentScale`.

Coordinate

P_UV vec2 CoronaTexCoord — The texture coordinate for the vertex.

Example

The following example causes the bottom edge of an image to wobble by a fixed amplitude:

local kernel = {}

-- "filter.custom.myWobble"
kernel.category = "filter"
kernel.name = "myWobble"

-- Shader code uses time environment variable CoronaTotalTime
kernel.isTimeDependent = true

kernel.vertex =
[[
P_POSITION vec2 VertexKernel( P_POSITION vec2 position )
{
    P_POSITION float amplitude = 10.0;
    position.y += sin( 3.0 * CoronaTotalTime + CoronaTexCoord.x ) * amplitude * CoronaTexCoord.y;

    return position;
}
]]
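
To see the wobble on screen, the kernel still has to be registered and applied to a display object. A minimal sketch, assuming the kernel table from the example above is in scope and that "image.png" is a placeholder for an image in your project:

```lua
-- Assumes the "kernel" table defined in the example above
graphics.defineEffect( kernel )

-- "image.png" is a placeholder file name; the size is arbitrary
local image = display.newImageRect( "image.png", 200, 200 )
image.x, image.y = display.contentCenterX, display.contentCenterY

-- No group was provided, so the group is "custom"
image.fill.effect = "filter.custom.myWobble"
```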

Fragment Kernels

Fragment kernels operate on a per-pixel basis, enabling you to modify each pixel (i.e. image processing) before it is drawn to the framebuffer. They must define the following function which accepts an incoming texture coordinate and returns a color vector, for example the pixel color to be used in the next stage of the pipeline.

Corona's default fragment kernel simply samples a single texture (CoronaSampler0) and, using CoronaColorScale(), modulates it by the display object's alpha/tint:

P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );
    return CoronaColorScale( texColor );
}

Time

The same vertex kernel time uniforms can be accessed by the fragment kernel.

Size

The same vertex kernel size uniforms can be accessed by the fragment kernel.
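
As a sketch of how the size uniforms might be used in a fragment kernel, the following blends each pixel with the texel one content pixel to its right. The effect name "mySmear" is hypothetical. The offset relies on the description of CoronaTexelSize.zw as the number of texels per content pixel.

```lua
local kernel = {}

-- "filter.custom.mySmear" (hypothetical name)
kernel.category = "filter"
kernel.name = "mySmear"

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    // One content pixel to the right, expressed in texels
    P_UV vec2 offset = vec2( CoronaTexelSize.z, 0.0 );

    P_COLOR vec4 here = texture2D( CoronaSampler0, texCoord );
    P_COLOR vec4 right = texture2D( CoronaSampler0, texCoord + offset );

    // Equal-weight blend of the two samples, then apply alpha/tint
    return CoronaColorScale( mix( here, right, 0.5 ) );
}
]]

graphics.defineEffect( kernel )
```

Note that the second lookup samples at a computed coordinate, which has the performance cost described under Avoid Dynamic Texture Lookups.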

Samplers

P_COLOR sampler2D CoronaSampler0 — The texture sampler for the first texture.
P_COLOR sampler2D CoronaSampler1 — The texture sampler for the second texture (requires a composite paint).
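
A composite effect that reads both samplers could look like the following sketch. The effect name "myBlend" and the image file names are placeholders, and the category string "composite" is assumed to correspond to the composite type described earlier.

```lua
local kernel = {}

-- "composite.custom.myBlend" (hypothetical name)
kernel.category = "composite"
kernel.name = "myBlend"

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR vec4 first = texture2D( CoronaSampler0, texCoord );
    P_COLOR vec4 second = texture2D( CoronaSampler1, texCoord );

    // Equal-weight blend of the two textures
    return CoronaColorScale( mix( first, second, 0.5 ) );
}
]]

graphics.defineEffect( kernel )

-- A composite paint supplies both textures (placeholder file names)
local rect = display.newRect( display.contentCenterX, display.contentCenterY, 200, 200 )
rect.fill =
{
    type = "composite",
    paint1 = { type = "image", filename = "first.png" },
    paint2 = { type = "image", filename = "second.png" },
}
rect.fill.effect = "composite.custom.myBlend"
```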

Alpha/Tint

All display objects have an alpha property. In addition, shape objects have a tint which is set either via object:setFillColor() or the color channel properties (r, g, b, a) of the object's fill property.

Generally, your shader should incorporate the effect of these properties into the color that your fragment kernel returns. You can do this by calling the following function to calculate the correct color:

P_COLOR vec4 CoronaColorScale( P_COLOR vec4 color );

This function takes an input color vector (red, green, blue, and alpha channels) and returns a color vector modulated by the display object's tint and alpha, as shown in the fragment kernel examples. Generally, you should call this function at the end of the fragment kernel so that you can properly calculate the color vector your fragment kernel should return.

Example

The following example brightens an image by a fixed amount per color component:

local kernel = {}

-- "filter.custom.myBrighten"
kernel.category = "filter"
kernel.name = "myBrighten"

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR float brightness = 0.5;
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

    // Pre-multiply the alpha to brightness
    brightness = brightness * texColor.a;

    // Add the brightness
    texColor.rgb += brightness;

    // Modulate by the display object's combined alpha/tint.
    return CoronaColorScale( texColor );
}
]]

Custom Varying Variables

A "varying" variable enables data to be passed from a vertex shader to the fragment shader. The vertex shader outputs this value which corresponds to the positions of the primitive's vertices. In turn, the fragment shader linearly interpolates this value across the primitive during rasterization.

In Corona, you can declare your own varying variables in the shader code. You should put them at the beginning of both your vertex and fragment code.

Example

The following example combines the wobble vertex and brighten fragment kernels. Unlike the "myBrighten" fragment example above, this version does not use a fixed value for brightness. Instead, the vertex shader calculates an oscillating brightness value for each vertex and the fragment shader linearly interpolates the brightness value according to the pixel it's shading.

local kernel = {}
kernel.category = "filter"
kernel.name = "wobbleAndBrighten"

-- Shader code uses time environment variable CoronaTotalTime
kernel.isTimeDependent = true

kernel.vertex =
[[
varying P_COLOR float delta; // Custom varying variable

P_POSITION vec2 VertexKernel( P_POSITION vec2 position )
{
    P_POSITION float amplitude = 10.0;

    position.y += sin( 3.0 * CoronaTotalTime + CoronaTexCoord.x ) * amplitude * CoronaTexCoord.y;

    // Calculate value for varying
    delta = 0.4*(CoronaTexCoord.y + sin( 3.0 * CoronaTotalTime + 2.0 * CoronaTexCoord.x ));

    return position;
}
]]

kernel.fragment =
[[
varying P_COLOR float delta; // Matches declaration in vertex shader

P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    // Brightness changes based on interpolated value of custom varying variable
    P_COLOR float brightness = delta;

    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

    // Pre-multiply the alpha to brightness
    brightness *= texColor.a;

    // Add the brightness
    texColor.rgb += brightness;

    // Modulate by the display object's combined alpha/tint.
    return CoronaColorScale( texColor );
}
]]

Effect Parameters

In Corona, you can pass effect parameters by setting appropriate properties on the effect of a ShapeObject. These properties depend on the effect. For example, the built-in brightness filter has an intensity parameter that can be propagated to the shader code:

object.fill.effect = "filter.brightness"
object.fill.effect.intensity = 0.4

Corona supports two methods for adding parameters to custom shader effects. These are mutually exclusive, so you must choose one or the other.

Method	Description
vertex userdata	Parameters are passed on a per-vertex basis. This generally performs better because changes to vertex data do not require OpenGL state changes. However, it is limited to 4 scalar values.
uniform userdata	Parameters are passed as uniforms. This is for effects which require more parameters than can be passed via vertex userdata.

Vertex Versus Uniform

On devices, OpenGL performs best when you are able to minimize state changes. This is because multiple objects can be batched into a single draw call if there are no state changes required between display objects.

Typically, it's best to use vertex userdata when you need to pass in effect parameters, because the parameter data can be passed in a vertex array. This maximizes the chance that Corona can batch draw calls together. This is especially true if you have numerous consecutive display objects with the same effect applied.
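
The situation described here, consecutive objects sharing one effect but differing in a parameter, can be sketched with the built-in brightness filter shown earlier. The image file name is a placeholder, and it is an assumption that this filter passes its intensity as vertex userdata; whether the objects are actually batched depends on the renderer.

```lua
-- "image.png" is a placeholder; positions and sizes are arbitrary
for i = 1, 5 do
    local image = display.newImageRect( "image.png", 50, 50 )
    image.x = 40 + ( i - 1 ) * 60
    image.y = display.contentCenterY

    -- Same effect on consecutive objects, different parameter per object
    image.fill.effect = "filter.brightness"
    image.fill.effect.intensity = 0.1 * i
end
```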

Vertex Userdata

When using vertex userdata to pass effect parameters, the effect parameters are copied for each vertex. To minimize the data size impact, the effect parameters are limited to a vec4 (vector of 4 floats). This is available as the following read-only vector variable in both the vertex and fragment kernels:

P_DEFAULT vec4 CoronaVertexUserData

For example, suppose you want to modify the above "filter.custom.myBrighten" effect example so that, in Lua, there is a "brightness" parameter for the effect:

object.fill.effect = "filter.custom.myBrighten"
object.fill.effect.brightness = 0.3

To accomplish this, you must instruct Corona to map the parameter name in Lua to the corresponding component of the CoronaVertexUserData vector. The following code tells Corona that the "brightness" parameter is the first component (index = 0) of the CoronaVertexUserData vector.

kernel.vertexData =
{
    {
        name = "brightness",
        default = 0,
        min = 0,
        max = 1,
        index = 0,  -- This corresponds to "CoronaVertexUserData.x"
    },
}

In the above array (kernel.vertexData), each element is a table and each table specifies:

name — The string name for the parameter exposed in Lua.
default — The default value.
min — The minimum value.
max — The maximum value.
index — The index for the corresponding vector component in CoronaVertexUserData:

index = 0 → CoronaVertexUserData.x
index = 1 → CoronaVertexUserData.y
index = 2 → CoronaVertexUserData.z
index = 3 → CoronaVertexUserData.w

Finally, modify FragmentKernel to read the parameter value via CoronaVertexUserData:

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR float brightness = CoronaVertexUserData.x;

    ...
}
]]
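
Putting the pieces together, a complete parameterized version of the effect might look like the following sketch. The kernel body is taken from the earlier "myBrighten" example with the fixed brightness replaced by the parameter. The image file name is a placeholder.

```lua
local kernel = {}

-- "filter.custom.myBrighten"
kernel.category = "filter"
kernel.name = "myBrighten"

-- Expose "brightness" in Lua as CoronaVertexUserData.x
kernel.vertexData =
{
    { name = "brightness", default = 0, min = 0, max = 1, index = 0 },
}

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR float brightness = CoronaVertexUserData.x;
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

    // Pre-multiply the alpha to brightness, then add it
    brightness *= texColor.a;
    texColor.rgb += brightness;

    // Modulate by the display object's combined alpha/tint
    return CoronaColorScale( texColor );
}
]]

graphics.defineEffect( kernel )

-- "image.png" is a placeholder file name
local object = display.newImageRect( "image.png", 200, 200 )
object.x, object.y = display.contentCenterX, display.contentCenterY
object.fill.effect = "filter.custom.myBrighten"
object.fill.effect.brightness = 0.3
```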

Uniform Userdata

(forthcoming feature)

GLSL Conventions and Best Practices

GLSL has many flavors across mobile and desktop. Corona assumes the use of GLSL ES (OpenGL ES 2.0). To maximize compatibility and performance, follow the conventions and best practices below.

Corona Simulator vs Device

Performance

Shader performance on desktop GPUs will not be the same as on devices. Therefore, if you run your shader under the Corona Simulator or in the Corona Shader Playground, you should also run it on actual devices to be sure that you are getting the desired performance.

Also note that if you are supporting devices from different manufacturers, the performance between them could vary significantly. On Android in particular, some high-end devices actually have underpowered GPUs, so you should not assume that you will get equal performance across different high-end Android devices.

Syntax

The Corona Simulator compiles your shader using desktop GLSL. Consequently, if you run your shader in the Corona Simulator, your shader may still contain GLSL ES errors that will not appear until you attempt to run your shader on a device.

If you have a fragment-only kernel shader effect, you can test out your shader code in the Corona Shader Playground. This playground verifies against GLSL ES in a WebGL-enabled browser.

Precision Qualifier Macros

Unlike other flavors of GLSL, GLSL ES (OpenGL ES 2.0) generally requires precision qualifiers to be specified in variable declarations. Thus, it's a good practice to be explicit about precision.

Instead of using raw precision qualifiers like lowp, you should use one of the following precision qualifier macros. The defaults are optimized for the type of data:

P_DEFAULT — For generic values; default is highp.
P_RANDOM — For random values; default is highp.
P_POSITION — For positions; default is mediump.
P_NORMAL — For normals; default is mediump.
P_UV — For texture coordinates; default is mediump.
P_COLOR — For pixel colors; default is lowp.

We strongly recommend that you use Corona's defaults for shader precision, all of which have been optimized to balance performance and fidelity. However, your project can override these settings in config.lua.

High-Precision Devices

Not all devices support high precision. Therefore, if your kernel requires high precision, you should use the GL_FRAGMENT_PRECISION_HIGH macro. This is 1 if high precision is supported on the device, or undefined otherwise.

Your kernel can degrade gracefully on devices that do not support highp if you write two implementations:

P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
#ifdef GL_FRAGMENT_PRECISION_HIGH
    // Code path for high precision calculations
#else
    // Code path for fallback
#endif
}
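
As one illustration of the two code paths, the following sketch adds film-grain style noise to an image. The effect name "myGrain" is hypothetical, and the hash function and its constants are a common shader idiom, not something Corona provides. Raw precision qualifiers are used here only to make the difference between the two paths explicit. The fallback uses a smaller multiplier on the assumption that values must stay within the minimum mediump range guaranteed by GLSL ES (about ±16384).

```lua
local kernel = {}

-- "filter.custom.myGrain" (hypothetical name)
kernel.category = "filter"
kernel.name = "myGrain"

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

#ifdef GL_FRAGMENT_PRECISION_HIGH
    // High precision path: copy the coordinate into a highp variable
    // so that the hash is evaluated at high precision
    highp vec2 p = texCoord;
    highp float noise = fract( sin( dot( p, vec2( 12.9898, 78.233 ) ) ) * 43758.5453 );
#else
    // Fallback path: smaller multiplier stays within mediump range
    mediump float noise = fract( sin( dot( texCoord, vec2( 12.9898, 78.233 ) ) ) * 437.585 );
#endif

    // Add a small amount of grain, pre-multiplied by the alpha
    texColor.rgb += ( noise - 0.5 ) * 0.1 * texColor.a;

    return CoronaColorScale( texColor );
}
]]

graphics.defineEffect( kernel )
```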

Pre-Multiplied Alpha

Corona provides textures with pre-multiplied alpha. Therefore, you may need to divide by the alpha to recover the original RGB values. However, for performance reasons, you should structure your calculations so that the divide is unnecessary. Compare the following two kernels that brighten an image.

In the following, the original RGB values are recovered by undoing the pre-multiplied alpha, and later, the alpha is re-applied. This is not ideal because it generates a lot of additional operations on the GPU for every pixel.

// Non-optimal Version
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR float brightness = 0.5;
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

    // BAD: Recover original RGBs via divide
    texColor.rgb /= texColor.a;

    // Add the brightness
    texColor.rgb += brightness;

    // BAD: Re-apply the pre-multiplied alpha
    texColor.rgb *= texColor.a;

    return CoronaColorScale( texColor );
}

This version pre-multiplies the alpha of the texture to the brightness variable so that it can be added directly to the texture's RGB values. This circumvents the deficiencies of the above implementation.

// Optimal Version
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR float brightness = 0.5;
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

    // GOOD: Pre-multiply the alpha to brightness
    brightness = brightness * texColor.a;

    // Add the brightness
    texColor.rgb += brightness;

    return CoronaColorScale( texColor );
}

Vector Calculations

Some devices do not have GPUs with vector processors. In those cases, vector calculations may be performed on a scalar processor. Generally, you should carefully consider the order of operations in your shader to ensure that unnecessary calculations can be avoided on a scalar processor.

Consolidate Scalar Calculations

In the following example, a vector processor would execute each multiplication in parallel. However, because of the order of operations, a scalar processor would perform 8 multiplications, even though two of the three operands are scalar values.

P_DEFAULT float f0, f1;
P_DEFAULT vec4 v0, v1;
v0 = (v1 * f0) * f1; // BAD: Multiply each scalar to a vector

A better ordering would be to multiply the two scalars first, then multiply the result against the vector. This reduces the calculation to 5 multiplies.

P_DEFAULT float f0, f1;
P_DEFAULT vec4 v0, v1;
v0 = v1 * (f0 * f1); // GOOD: Multiply scalars first

Use Write Masks

Similar logic applies when your vector calculation does not use all components. A "write mask" allows you to limit the calculations to only the components specified in the mask. The following runs twice as fast on a scalar processor because the write mask is used to specify that only two of the four components are needed.

P_DEFAULT vec4 v0;
P_DEFAULT vec4 v1;
P_DEFAULT vec4 v2;
v2.xz = v0.xz * v1.xz; // GOOD: Only the x and z components are calculated

Avoid Dynamic Texture Lookups

When a fragment shader samples a texture at a location different from the texture coordinate passed to the shader, it causes a dynamic texture lookup, also known as a "dependent texture read." In OpenGL ES 2.0, dependent texture reads can delay texel data loading and reduce performance. This is why certain effects that sample a region of texels, for instance blur effects, are slower.

In contrast, effects that have no dependent texture reads enable the GPU to pre-fetch texel data before the shader executes, reducing I/O latency.
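
One common way to reduce dependent texture reads is to compute neighboring coordinates in the vertex kernel and pass them to the fragment kernel as custom varying variables. The following is a sketch of a three-tap horizontal blur written that way. The effect name "myHorizontalBlur" is hypothetical, and whether the lookups are actually pre-fetched depends on the GPU.

```lua
local kernel = {}

-- "filter.custom.myHorizontalBlur" (hypothetical name)
kernel.category = "filter"
kernel.name = "myHorizontalBlur"

kernel.vertex =
[[
varying P_UV vec2 leftCoord;
varying P_UV vec2 rightCoord;

P_POSITION vec2 VertexKernel( P_POSITION vec2 position )
{
    // One content pixel, expressed in texels
    P_UV vec2 offset = vec2( CoronaTexelSize.z, 0.0 );

    // Compute the neighbor coordinates once per vertex
    leftCoord = CoronaTexCoord - offset;
    rightCoord = CoronaTexCoord + offset;

    return position;
}
]]

kernel.fragment =
[[
varying P_UV vec2 leftCoord;
varying P_UV vec2 rightCoord;

P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    // Each lookup uses an interpolated coordinate without modification.
    // The weights sum to 1.0, so the result stays within lowp range.
    P_COLOR vec4 color = texture2D( CoronaSampler0, texCoord ) * 0.5;
    color += texture2D( CoronaSampler0, leftCoord ) * 0.25;
    color += texture2D( CoronaSampler0, rightCoord ) * 0.25;

    return CoronaColorScale( color );
}
]]

graphics.defineEffect( kernel )
```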

Avoid Branching and Loops

Branching instructions (if conditions) are expensive. When possible, for loops should be unrolled or replaced by vector operations.
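
As a sketch of replacing a branch with built-in functions, the following kernel darkens only the pixels whose luminance is below a threshold, using step() and mix() rather than an if statement. The effect name "myDarkenShadows", the threshold, and the luminance weights are illustrative assumptions.

```lua
local kernel = {}

-- "filter.custom.myDarkenShadows" (hypothetical name)
kernel.category = "filter"
kernel.name = "myDarkenShadows"

kernel.fragment =
[[
P_COLOR vec4 FragmentKernel( P_UV vec2 texCoord )
{
    P_COLOR vec4 texColor = texture2D( CoronaSampler0, texCoord );

    // Approximate luminance of the (pre-multiplied) color
    P_COLOR float luma = dot( texColor.rgb, vec3( 0.299, 0.587, 0.114 ) );

    // step() returns 0.0 below the threshold and 1.0 at or above it.
    // This replaces: if ( luma < 0.5 ) { texColor.rgb *= 0.5; }
    P_COLOR float keep = step( 0.5, luma );
    texColor.rgb *= mix( 0.5, 1.0, keep );

    return CoronaColorScale( texColor );
}
]]

graphics.defineEffect( kernel )
```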