Configuration guidelines

* added detailed guidelines for the selection of configuration values
* added additional compile-time checks to prevent bad configurations
* removed RANDOMX_SUPERSCALAR_MAX_SIZE parameter
pr-configuration
tevador 5 years ago
parent 37e9e77905
commit 9bae4d0bfb

@ -45,7 +45,7 @@ RandomX was primarily designed as a PoW algorithm for [Monero](https://www.getmo
* The key `K` is selected to be the hash of a block in the blockchain - this block is called the 'key block'. For optimal mining and verification performance, the key should change every 2048 blocks (~2.8 days) and there should be a delay of 64 blocks (~2 hours) between the key block and the change of the key `K`. This can be achieved by changing the key when `blockHeight % 2048 == 64` and selecting key block such that `keyBlockHeight % 2048 == 0`.
* The input `H` is the standard hashing blob.
If you wish to use RandomX as a PoW algorithm for your cryptocurrency, we strongly recommend not using the [default parameters](src/configuration.h) to avoid compatibility with Monero.
If you wish to use RandomX as a PoW algorithm for your cryptocurrency, please follow the [configuration guidelines](doc/configuration.md).
### CPU mining performance
Preliminary performance of selected CPUs using the optimal number of threads (T) and large pages (if possible), in hashes per second (H/s):

@ -0,0 +1,269 @@
# RandomX configuration
RandomX has 45 customizable parameters (see table below). We recommend each project using RandomX to select a unique configuration to prevent network attacks from hashpower rental services.
These parameters can be modified in source file [configuration.h](../src/configuration.h).
|parameter|description|default value|
|---------|-----|-------|
|`RANDOMX_ARGON_MEMORY`|The number of 1 KiB Argon2 blocks in the Cache| `262144`|
|`RANDOMX_ARGON_ITERATIONS`|The number of Argon2d iterations for Cache initialization|`3`|
|`RANDOMX_ARGON_LANES`|The number of parallel lanes for Cache initialization|`1`|
|`RANDOMX_ARGON_SALT`|Argon2 salt|`"RandomX\x03"`|
|`RANDOMX_CACHE_ACCESSES`|The number of random Cache accesses per Dataset item|`8`|
|`RANDOMX_SUPERSCALAR_LATENCY`|Target latency for SuperscalarHash (in cycles of the reference CPU)|`170`|
|`RANDOMX_DATASET_BASE_SIZE`|Dataset base size in bytes|`2147483648`|
|`RANDOMX_DATASET_EXTRA_SIZE`|Dataset extra size in bytes|`33554368`|
|`RANDOMX_PROGRAM_SIZE`|The number of instructions in a RandomX program|`256`|
|`RANDOMX_PROGRAM_ITERATIONS`|The number of iterations per program|`2048`|
|`RANDOMX_PROGRAM_COUNT`|The number of programs per hash|`8`|
|`RANDOMX_JUMP_BITS`|Jump condition mask size in bits|`8`|
|`RANDOMX_JUMP_OFFSET`|Jump condition mask offset in bits|`8`|
|`RANDOMX_SCRATCHPAD_L3`|Scratchpad size in bytes|`2097152`|
|`RANDOMX_SCRATCHPAD_L2`|Scratchpad L2 size in bytes|`262144`|
|`RANDOMX_SCRATCHPAD_L1`|Scratchpad L1 size in bytes|`16384`|
|`RANDOMX_FREQ_*` (29x)|Instruction frequencies|multiple values|
Not all of the parameters can be changed safely and most parameters have some contraints on what values can be selected. Follow the guidelines below.
### RANDOMX_ARGON_MEMORY
This parameter determines the amount of memory needed in the light mode. Memory is specified in KiB (1 KiB = 1024 bytes).
#### Permitted values
Any integer power of 2.
#### Notes
Lower sizes will reduce the memory-hardness of the algorithm.
### RANDOMX_ARGON_ITERATIONS
Determines the number of passes of Argon2 that are used to generate the Cache.
#### Permitted values
Any positive integer.
#### Notes
The time needed to initialize the Cache is proportional to the value of this constant.
### RANDOMX_ARGON_LANES
The number of parallel lanes for Cache initialization.
#### Permitted values
Any positive integer.
#### Notes
This parameter determines how many threads can be used for Cache initialization.
### RANDOMX_ARGON_SALT
Salt value for Cache initialization.
#### Permitted values
Any string of byte values.
#### Note
Every implementation should choose a unique salt value.
### RANDOMX_CACHE_ACCESSES
The number of random Cache access per Dataset item.
#### Permitted values
Any integer greater than 1.
#### Notes
This value directly determines the performance ratio between the 'fast' and 'light' modes.
### RANDOMX_SUPERSCALAR_LATENCY
Target latency for SuperscalarHash, in cycles of the reference CPU.
#### Permitted values
Any positive integer.
#### Notes
The default value was tuned so that a high-performance superscalar CPU running at 2-4 GHz will execute SuperscalarHash in similar time it takes to load data from RAM (40-80 ns). Using a lower value will make Dataset generation (and light mode) more memory bound, while increasing this value will make Dataset generation (and light mode) more compute bound.
### RANDOMX_DATASET_BASE_SIZE
Dataset base size in bytes.
#### Permitted values
Integer powers of 2 in the range 64 - 4294967296 (inclusive).
#### Note
This constant affects the memory requirements in fast mode. Some values are unsafe depending on other parameters. See [Unsafe configurations](#unsafe-configurations).
### RANDOMX_DATASET_EXTRA_SIZE
Dataset extra size in bytes.
#### Permitted values
Non-negative integer divisible by 64.
#### Note
This constant affects the memory requirements in fast mode. Some values are unsafe depending on other parameters. See [Unsafe configurations](#unsafe-configurations).
### RANDOMX_PROGRAM_SIZE
The number of instructions in a RandomX program.
#### Permitted values
Any positive integer divisible by 8.
#### Notes
Smaller values will make RandomX more DRAM-latency bound, while higher values will make RandomX more compute-bound. Some values are unsafe. See [Unsafe configurations](#unsafe-configurations).
### RANDOMX_PROGRAM_ITERATIONS
The number of iterations per program.
#### Permitted values
Any positive integer.
#### Notes
Time per hash increases linearly with this constant. Smaller values will increase the overhead of program compilation, while larger values may allow more time for optimizations. Some values are unsafe. See [Unsafe configurations](#unsafe-configurations).
### RANDOMX_PROGRAM_COUNT
The number of programs per hash.
#### Permitted values
Any positive integer.
#### Notes
Time per hash increases linearly with this constant. Some values are unsafe. See [Unsafe configurations](#unsafe-configurations).
### RANDOMX_JUMP_BITS
Jump condition mask size in bits.
#### Permitted values
Positive integers. The sum of `RANDOMX_JUMP_BITS` and `RANDOMX_JUMP_OFFSET` must not exceed 16.
#### Notes
This determines the jump probability of the CBRANCH instruction. The default value of 8 results in jump probability of <code>1/2<sup>8</sup> = 1/256</code>. Increasing this constant will decrease the rate of jumps (and vice versa).
### RANDOMX_JUMP_OFFSET
Jump condition mask offset in bits.
#### Permitted values
Non-negative integers. The sum of `RANDOMX_JUMP_BITS` and `RANDOMX_JUMP_OFFSET` must not exceed 16.
#### Notes
Since the low-order bits of RandomX registers are slightly biased, this offset moves the condition mask to higher bits, which are less biased. Using values smaller than the default may result in a slightly lower jump probability than the theoretical value calculated from `RANDOMX_JUMP_BITS`.
### RANDOMX_SCRATCHPAD_L3
RandomX Scratchpad size in bytes.
#### Permitted values
Any integer power of 2. Must be larger than or equal to `RANDOMX_SCRATCHPAD_L2`.
#### Notes
The default value of 2 MiB was selected to match the typical cache/core ratio of desktop processors. Using a lower value will make RandomX more core-bound, while using larger values will make the algorithm more latency-bound. Some values are unsafe depending on other parameters. See [Unsafe configurations](#unsafe-configurations).
### RANDOMX_SCRATCHPAD_L2
Scratchpad L2 size in bytes.
#### Permitted values
Any integer power of 2. Must be larger than or equal to `RANDOMX_SCRATCHPAD_L1`.
#### Notes
The default value of 256 KiB was selected to match the typical per-core L2 cache size of desktop processors. Using a lower value will make RandomX more core-bound, while using larger values will make the algorithm more latency-bound.
### RANDOMX_SCRATCHPAD_L1
Scratchpad L1 size in bytes.
#### Permitted values
Any integer power of 2. The minimum is 64 bytes.
#### Notes
The default value of 16 KiB was selected to be about half of the per-core L1 cache size of desktop processors. Using a lower value will make RandomX more core-bound, while using larger values will make the algorithm more latency-bound.
### RANDOMX_FREQ_*
Instruction frequencies (per 256 instructions).
#### Permitted values
There is a total of 29 different instructions. The sum of frequencies must be equal to 256.
#### Notes
Making large changes to the default values is not recommended. The only exceptions are the instruction pairs IROR_R/IROL_R, FADD_R/FSUB_R and FADD_M/FSUB_M, which are functionally equivalent.
## Unsafe configurations
There are some configurations that are considered 'unsafe' because they affect the security of the algorithm against attacks. If the conditions listed below are not satisfied, the configuration is unsafe and a compilation error is emitted when building the RandomX library.
These checks can be disabled by definining `RANDOMX_UNSAFE` when building RandomX, e.g. by using `-DRANDOMX_UNSAFE` command line switch in GCC or MSVC. It is not recommended to disable these checks except for testing purposes.
### 1. Memory-time tradeoffs
#### Condition
````
RANDOMX_CACHE_ACCESSES * RANDOMX_ARGON_MEMORY * 1024 + 33554432 >= RANDOMX_DATASET_BASE_SIZE + RANDOMX_DATASET_EXTRA_SIZE
````
Configurations not satisfying this condition are vulnerable to memory-time tradeoffs, which enables efficient mining in light mode.
#### Solutions
* Increase `RANDOMX_CACHE_ACCESSES` or `RANDOMX_ARGON_MEMORY`.
* Decrease `RANDOMX_DATASET_BASE_SIZE` or `RANDOMX_DATASET_EXTRA_SIZE`.
### 2. Insufficient Scratchpad writes
#### Condition
````
(128 + RANDOMX_PROGRAM_SIZE * RANDOMX_FREQ_ISTORE / 256) * (RANDOMX_PROGRAM_COUNT * RANDOMX_PROGRAM_ITERATIONS) >= RANDOMX_SCRATCHPAD_L3
````
Configurations not satisfying this condition are vulnerable to Scratchpad size optimizations due to low amount of writes.
#### Solutions
* Increase `RANDOMX_PROGRAM_SIZE`, `RANDOMX_FREQ_ISTORE`, `RANDOMX_PROGRAM_COUNT` or `RANDOMX_PROGRAM_ITERATIONS`.
* Decrease `RANDOMX_SCRATCHPAD_L3`.
### 3. Program filtering strategies
#### Condition
```
RANDOMX_PROGRAM_COUNT > 1
```
Configurations not satisfying this condition are vulnerable to program filtering strategies.
#### Solution
* Increase `RANDOMX_PROGRAM_COUNT` to at least 2.
### 4. Low program entropy
#### Condition
```
RANDOMX_PROGRAM_SIZE >= 64
```
Configurations not satisfying this condition do not have a sufficient number of instruction combinations.
#### Solution
* Increase `RANDOMX_PROGRAM_SIZE` to at least 64.
### 5. High compilation overhead
#### Condition
```
RANDOMX_PROGRAM_ITERATIONS >= 400
```
Configurations not satisfying this condition have a program compilation overhead exceeding 10%.
#### Solution
* Increase `RANDOMX_PROGRAM_ITERATIONS` to at least 400.

@ -62,7 +62,6 @@ RandomX has several configurable parameters that are listed in Table 1.2.1 with
|`RANDOMX_ARGON_SALT`|Argon2 salt|`"RandomX\x03"`|
|`RANDOMX_CACHE_ACCESSES`|The number of random Cache accesses per Dataset item|`8`|
|`RANDOMX_SUPERSCALAR_LATENCY`|Target latency for SuperscalarHash (in cycles of the reference CPU)|`170`|
|`RANDOMX_SUPERSCALAR_MAX_SIZE`|The maximum number of instructions of SuperscalarHash|`512`|
|`RANDOMX_DATASET_BASE_SIZE`|Dataset base size in bytes|`2147483648`|
|`RANDOMX_DATASET_EXTRA_SIZE`|Dataset extra size in bytes|`33554368`|
|`RANDOMX_PROGRAM_SIZE`|The number of instructions in a RandomX program|`256`|
@ -786,7 +785,7 @@ SuperscalarHash programs are generated to maximize the usage of all 3 execution
Program generation is complete when one of two conditions is met:
1. An instruction is scheduled for execution on cycle that is equal to or greater than `RANDOMX_SUPERSCALAR_LATENCY`
1. The number of generated instructions reaches `RANDOMX_SUPERSCALAR_MAX_SIZE`
1. The number of generated instructions reaches `3 * RANDOMX_SUPERSCALAR_LATENCY + 2`.
#### 6.3.1 Decoding stage

@ -5,7 +5,6 @@ RANDOMX_ARGON_LANES EQU 1t
RANDOMX_ARGON_SALT TEXTEQU <"RandomX\x03">
RANDOMX_CACHE_ACCESSES EQU 8t
RANDOMX_SUPERSCALAR_LATENCY EQU 170t
RANDOMX_SUPERSCALAR_MAX_SIZE EQU 512t
RANDOMX_DATASET_BASE_SIZE EQU 2147483648t
RANDOMX_DATASET_EXTRA_SIZE EQU 33554368t
RANDOMX_PROGRAM_SIZE EQU 256t

@ -37,7 +37,9 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
namespace randomx {
static_assert(RANDOMX_ARGON_MEMORY > 0, "RANDOMX_ARGON_MEMORY must be greater than 0.");
static_assert((RANDOMX_ARGON_MEMORY & (RANDOMX_ARGON_MEMORY - 1)) == 0, "RANDOMX_ARGON_MEMORY must be a power of 2.");
static_assert(RANDOMX_DATASET_BASE_SIZE >= 64, "RANDOMX_DATASET_BASE_SIZE must be at least 64.");
static_assert((RANDOMX_DATASET_BASE_SIZE & (RANDOMX_DATASET_BASE_SIZE - 1)) == 0, "RANDOMX_DATASET_BASE_SIZE must be a power of 2.");
static_assert(RANDOMX_DATASET_BASE_SIZE <= 4294967296ULL, "RANDOMX_DATASET_BASE_SIZE must not exceed 4294967296.");
static_assert(RANDOMX_DATASET_EXTRA_SIZE % 64 == 0, "RANDOMX_DATASET_EXTRA_SIZE must be divisible by 64.");
@ -48,8 +50,10 @@ namespace randomx {
static_assert(RANDOMX_SCRATCHPAD_L3 >= RANDOMX_SCRATCHPAD_L2, "RANDOMX_SCRATCHPAD_L3 must be greater than or equal to RANDOMX_SCRATCHPAD_L2.");
static_assert((RANDOMX_SCRATCHPAD_L2 & (RANDOMX_SCRATCHPAD_L2 - 1)) == 0, "RANDOMX_SCRATCHPAD_L2 must be a power of 2.");
static_assert(RANDOMX_SCRATCHPAD_L2 >= RANDOMX_SCRATCHPAD_L1, "RANDOMX_SCRATCHPAD_L2 must be greater than or equal to RANDOMX_SCRATCHPAD_L1.");
static_assert(RANDOMX_SCRATCHPAD_L1 >= 64, "RANDOMX_SCRATCHPAD_L1 must be at least 64.");
static_assert((RANDOMX_SCRATCHPAD_L1 & (RANDOMX_SCRATCHPAD_L1 - 1)) == 0, "RANDOMX_SCRATCHPAD_L1 must be a power of 2.");
static_assert(RANDOMX_CACHE_ACCESSES > 1, "RANDOMX_CACHE_ACCESSES must be greater than 1");
static_assert(RANDOMX_SUPERSCALAR_LATENCY > 0, "RANDOMX_SUPERSCALAR_LATENCY must be greater than 0");
static_assert(RANDOMX_JUMP_BITS > 0, "RANDOMX_JUMP_BITS must be greater than 0.");
static_assert(RANDOMX_JUMP_OFFSET >= 0, "RANDOMX_JUMP_OFFSET must be greater than or equal to 0.");
static_assert(RANDOMX_JUMP_BITS + RANDOMX_JUMP_OFFSET <= 16, "RANDOMX_JUMP_BITS + RANDOMX_JUMP_OFFSET must not exceed 16.");
@ -64,8 +68,10 @@ namespace randomx {
static_assert(wtSum == 256, "Sum of instruction frequencies must be 256.");
constexpr int ArgonBlockSize = 1024;
constexpr int ArgonSaltSize = sizeof(RANDOMX_ARGON_SALT) - 1;
constexpr int ArgonSaltSize = sizeof("" RANDOMX_ARGON_SALT) - 1;
constexpr int SuperscalarMaxSize = 3 * RANDOMX_SUPERSCALAR_LATENCY + 2;
constexpr int CacheLineSize = RANDOMX_DATASET_ITEM_SIZE;
constexpr int ScratchpadSize = RANDOMX_SCRATCHPAD_L3;
constexpr uint32_t CacheLineAlignMask = (RANDOMX_DATASET_BASE_SIZE - 1) & ~(CacheLineSize - 1);
@ -76,6 +82,15 @@ namespace randomx {
constexpr int ConditionOffset = RANDOMX_JUMP_OFFSET;
constexpr int StoreL3Condition = 14;
//Prevent some unsafe configurations.
#ifndef RANDOMX_UNSAFE
static_assert(RANDOMX_CACHE_ACCESSES * RANDOMX_ARGON_MEMORY * ArgonBlockSize + 33554432 >= RANDOMX_DATASET_BASE_SIZE + RANDOMX_DATASET_EXTRA_SIZE, "Unsafe configuration: Memory-time tradeoffs");
static_assert((128 + RANDOMX_PROGRAM_SIZE * RANDOMX_FREQ_ISTORE / 256) * (RANDOMX_PROGRAM_COUNT * RANDOMX_PROGRAM_ITERATIONS) >= RANDOMX_SCRATCHPAD_L3, "Unsafe configuration: Insufficient Scratchpad writes");
static_assert(RANDOMX_PROGRAM_COUNT > 1, "Unsafe configuration: Program filtering strategies");
static_assert(RANDOMX_PROGRAM_SIZE >= 64, "Unsafe configuration: Low program entropy");
static_assert(RANDOMX_PROGRAM_ITERATIONS >= 400, "Unsafe configuration: High compilation overhead");
#endif
#ifdef TRACE
constexpr bool trace = true;
#else

@ -31,10 +31,10 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//Cache size in KiB. Must be a power of 2.
#define RANDOMX_ARGON_MEMORY 262144
//Number of Argon2d iterations for Cache initialization
//Number of Argon2d iterations for Cache initialization.
#define RANDOMX_ARGON_ITERATIONS 3
//Number of parallel lanes for Cache initialization
//Number of parallel lanes for Cache initialization.
#define RANDOMX_ARGON_LANES 1
//Argon2d salt
@ -46,22 +46,19 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//Target latency for SuperscalarHash (in cycles of the reference CPU).
#define RANDOMX_SUPERSCALAR_LATENCY 170
//The maximum size of a SuperscalarHash program (number of instructions).
#define RANDOMX_SUPERSCALAR_MAX_SIZE 512
//Dataset base size in bytes. Must be a power of 2.
#define RANDOMX_DATASET_BASE_SIZE 2147483648
//Dataset extra size. Must be divisible by 64.
#define RANDOMX_DATASET_EXTRA_SIZE 33554368
//Number of instructions in a RandomX program
//Number of instructions in a RandomX program. Must be divisible by 8.
#define RANDOMX_PROGRAM_SIZE 256
//Number of iterations during VM execution
//Number of iterations during VM execution.
#define RANDOMX_PROGRAM_ITERATIONS 2048
//Number of chained VM executions per hash
//Number of chained VM executions per hash.
#define RANDOMX_PROGRAM_COUNT 8
//Scratchpad L3 size in bytes. Must be a power of 2.
@ -70,13 +67,13 @@ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//Scratchpad L2 size in bytes. Must be a power of two and less than or equal to RANDOMX_SCRATCHPAD_L3.
#define RANDOMX_SCRATCHPAD_L2 262144
//Scratchpad L1 size in bytes. Must be a power of two and less than or equal to RANDOMX_SCRATCHPAD_L2.
//Scratchpad L1 size in bytes. Must be a power of two (minimum 64) and less than or equal to RANDOMX_SCRATCHPAD_L2.
#define RANDOMX_SCRATCHPAD_L1 16384
//Jump condition mask size in bits.
#define RANDOMX_JUMP_BITS 8
//Jump condition mask offset in bits.
//Jump condition mask offset in bits. The sum of RANDOMX_JUMP_BITS and RANDOMX_JUMP_OFFSET must not exceed 16.
#define RANDOMX_JUMP_OFFSET 8
/*
@ -84,6 +81,7 @@ Instruction frequencies (per 256 opcodes)
Total sum of frequencies must be 256
*/
//Integer instructions
#define RANDOMX_FREQ_IADD_RS 25
#define RANDOMX_FREQ_IADD_M 7
#define RANDOMX_FREQ_ISUB_R 16
@ -102,6 +100,7 @@ Total sum of frequencies must be 256
#define RANDOMX_FREQ_IROL_R 0
#define RANDOMX_FREQ_ISWAP_R 4
//Floating point instructions
#define RANDOMX_FREQ_FSWAP_R 8
#define RANDOMX_FREQ_FADD_R 20
#define RANDOMX_FREQ_FADD_M 5
@ -112,11 +111,14 @@ Total sum of frequencies must be 256
#define RANDOMX_FREQ_FDIV_M 4
#define RANDOMX_FREQ_FSQRT_R 6
//Control instructions
#define RANDOMX_FREQ_CBRANCH 16
#define RANDOMX_FREQ_CFROUND 1
//Store instruction
#define RANDOMX_FREQ_ISTORE 16
//No-op instruction
#define RANDOMX_FREQ_NOP 0
/* ------
256

@ -673,8 +673,8 @@ namespace randomx {
//Each decode cycle decodes 16 bytes of x86 code.
//Since a decode cycle produces on average 3.45 macro-ops and there are only 3 ALU ports, execution ports are always
//saturated first. The cycle limit is present only to guarantee loop termination.
//Program size is limited to RANDOMX_SUPERSCALAR_MAX_SIZE instructions.
for (decodeCycle = 0; decodeCycle < RANDOMX_SUPERSCALAR_LATENCY && !portsSaturated && programSize < RANDOMX_SUPERSCALAR_MAX_SIZE; ++decodeCycle) {
//Program size is limited to SuperscalarMaxSize instructions.
for (decodeCycle = 0; decodeCycle < RANDOMX_SUPERSCALAR_LATENCY && !portsSaturated && programSize < SuperscalarMaxSize; ++decodeCycle) {
//select a decode configuration
decodeBuffer = decodeBuffer->fetchNext(currentInstruction.getType(), decodeCycle, mulCount, gen);
@ -688,7 +688,7 @@ namespace randomx {
//if we have issued all macro-ops for the current RandomX instruction, create a new instruction
if (macroOpIndex >= currentInstruction.getInfo().getSize()) {
if (portsSaturated || programSize >= RANDOMX_SUPERSCALAR_MAX_SIZE)
if (portsSaturated || programSize >= SuperscalarMaxSize)
break;
//select an instruction so that the first macro-op fits into the current slot
currentInstruction.createForSlot(gen, decodeBuffer->getCounts()[bufferIndex], decodeBuffer->getIndex(), decodeBuffer->getSize() == bufferIndex + 1, bufferIndex == 0);

@ -56,7 +56,7 @@ namespace randomx {
addrReg = val;
}
Instruction programBuffer[RANDOMX_SUPERSCALAR_MAX_SIZE];
Instruction programBuffer[SuperscalarMaxSize];
uint32_t size;
int addrReg;
double ipc;

Loading…
Cancel
Save