# SET File Implementation Guide
**Version 4.2**  
**Updated: December 2025**

---

## About This Guide

This document provides comprehensive guidance for implementing SET file parsers, query languages, and extended functionality. These are **recommendations and patterns** for building SET-based systems, not requirements of the core specification.

**For the specifications, see:** 

[SET File Format One-Page Quick-Start](SET_File_OnePage.md)

[SET File Format Core Specification v4.2](SET_File_Core_v4.2.md)

[SET File Format Full Specification v4.2](SET_File_Full_Spec_v4.2.md)

[SET File Format QSet Specification v4.2](QSet_Spec_v4.2.md)


---

## Table of Contents

1. [Query Language (SetQL)](#1-query-language-setql)
2. [Implementation Patterns & Conventions](#2-implementation-patterns--conventions)
3. [SetTag Extensions](#3-settag-extensions)
4. [CRUD Operations](#4-crud-operations)
5. [Validation & Error Handling](#5-validation--error-handling)
6. [Programming Interface Guidelines](#6-programming-interface-guidelines)
7. [Complete Examples](#7-complete-examples)
8. [Version History & Migration](#8-version-history--migration)

---

## 1. Query Language (SetQL)

SetQL provides a simple query syntax for filtering and selecting data from SET files. This is an **optional feature** - implementations may choose whether to support it.

### 1.1 Basic Syntax

```
FROM [GroupName] WHERE field=value
FROM [GroupName] SELECT field1,field2
FROM [GroupName] WHERE field>value ORDER BY field
```

### 1.2 Supported Operations

**Comparison Operators:**
- `=` Equal to
- `!=` Not equal to
- `>` Greater than
- `<` Less than
- `>=` Greater than or equal
- `<=` Less than or equal

**Pattern Matching:**
- `LIKE` Pattern matching (use `%` as wildcard)

**List Operations:**
- `IN` Value in list

**Logical Operators:**
- `AND` Combine conditions
- `OR` Alternative conditions
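
A minimal sketch of how these operators might be evaluated against a single record. The names `likeToRegExp` and `matchesCondition`, and the `{ field, op, value }` condition shape, are assumptions made for illustration; the specification does not define them.

```javascript
// Hypothetical helper: convert a LIKE pattern ('%' = any run of characters)
// into an anchored regular expression.
function likeToRegExp(pattern) {
  const escaped = pattern
    .split('%')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Hypothetical helper: test one condition such as
// { field: 'age', op: '>', value: 18 } against a record object.
function matchesCondition(record, { field, op, value }) {
  const actual = record[field];
  // Assumption: stored values are strings, so they are converted to the
  // type of the query value before comparing
  const left = typeof value === 'number' ? Number(actual)
             : typeof value === 'boolean' ? actual === 'true'
             : actual;
  switch (op) {
    case '=':    return left === value;
    case '!=':   return left !== value;
    case '>':    return left > value;
    case '<':    return left < value;
    case '>=':   return left >= value;
    case '<=':   return left <= value;
    case 'LIKE': return likeToRegExp(value).test(String(actual));
    case 'IN':   return value.includes(actual);
    default:     throw new Error(`Unsupported operator: ${op}`);
  }
}

const user = { username: 'alice', email: 'alice@example.com', age: '34' };
console.log(matchesCondition(user, { field: 'age', op: '>', value: 18 }));                     // true
console.log(matchesCondition(user, { field: 'email', op: 'LIKE', value: '%@example.com' }));   // true
```

With this shape, `AND` can be expressed as `conditions.every(c => matchesCondition(record, c))` and `OR` as `conditions.some(...)`.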

### 1.3 Query Components

**FROM clause** - Specifies the group to query
```
FROM [USERS]
FROM [DATABASE_CONFIG]
```

**WHERE clause** - Filters records
```
WHERE role='admin'
WHERE age>18 AND status='active'
WHERE email LIKE '%@example.com'
```

**SELECT clause** - Chooses specific fields
```
SELECT username,email
SELECT *
```

**ORDER BY clause** - Sorts results
```
ORDER BY last_name
ORDER BY age DESC
```

### 1.4 Examples

**Simple queries:**
```
FROM [USERS] WHERE role='admin'
FROM [PRODUCTS] WHERE price>100
FROM [SETTINGS] WHERE key LIKE 'Email%'
```

**Complex queries:**
```
FROM [EMPLOYEES] WHERE department='Engineering' AND salary>75000 ORDER BY hire_date
FROM [ORDERS] WHERE status IN ('pending','processing') AND total>500
FROM [CONTACTS] WHERE (city='Seattle' OR city='Portland') AND active=true
```

**With SELECT:**
```
FROM [USERS] SELECT username,email WHERE role='user'
FROM [PRODUCTS] SELECT name,price WHERE category='Electronics' ORDER BY price DESC
```
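
The clause order used in these examples (`FROM`, then optional `SELECT`, `WHERE`, and `ORDER BY`) can be split apart with a single regular expression. The following is a rough sketch only: `parseSetQL` and the shape of its return value are hypothetical, and string values that themselves contain ` WHERE ` or ` ORDER BY ` are not handled.

```javascript
// Hypothetical helper: split a SetQL query into its clauses. The WHERE
// text is returned unparsed; condition parsing is a separate concern.
function parseSetQL(query) {
  const pattern =
    /^FROM\s+\[([A-Za-z0-9_-]+)\](?:\s+SELECT\s+(.+?))?(?:\s+WHERE\s+(.+?))?(?:\s+ORDER BY\s+(\S+)(?:\s+(ASC|DESC))?)?\s*$/;
  const match = pattern.exec(query.trim());
  if (!match) {
    throw new Error(`Unrecognized query: ${query}`);
  }
  const [, group, select, where, orderField, direction] = match;
  return {
    group,
    // No SELECT clause and 'SELECT *' both mean "all fields"
    select: !select || select.trim() === '*'
      ? null
      : select.split(',').map(name => name.trim()),
    where: where ? where.trim() : null,
    orderBy: orderField
      ? { field: orderField, descending: direction === 'DESC' }
      : null
  };
}

console.log(parseSetQL("FROM [USERS] SELECT username,email WHERE role='user'"));
```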

### 1.5 Text Block References

Text blocks are not directly queryable, but references are resolved in query results:

```
FROM [ARTICLES] SELECT title,body
```

If the `body` field contains `[{ARTICLE_1_BODY}]`, the query result includes the resolved text content.

### 1.6 Implementation Notes

- Field names and values are compared case-sensitively
- String values should be enclosed in single quotes
- Numeric values do not need quotes
- Boolean values: `true` / `false` (lowercase)
- NULL values: represent as an empty string, or define implementation-specific null handling
- Implementations may extend SetQL with additional operators or functions

---

## 2. Implementation Patterns & Conventions

The SET file format's simple structure enables many powerful patterns through creative use of existing features. These are **conventions, not requirements** - implementations choose what makes sense for their use cases.

### 2.1 Hierarchical Data via Dot Notation

Use dots in key names to represent nested structures.

```
[DATABASE_CONFIG]
host|localhost
port|5432
connection.pool.min|5
connection.pool.max|20
connection.timeout|30
ssl.enabled|true
ssl.cert|/path/to/cert.pem
[EOG]
```

**Simple parser:** Treats `connection.pool.min` as a single key  
**Advanced parser:** Builds nested structure:
```javascript
{
  host: "localhost",
  connection: {
    pool: { min: 5, max: 20 },
    timeout: 30
  },
  ssl: {
    enabled: true,
    cert: "/path/to/cert.pem"
  }
}
```
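
One way an advanced parser might build that structure is sketched here. `expandDotKeys` is a hypothetical helper name, and values are deliberately left as strings; converting `"5"` to `5` would be a separate type-handling step.

```javascript
// Hypothetical helper: expand dotted keys into nested objects.
// Values are left as strings; type conversion is a separate step.
function expandDotKeys(flat) {
  const result = {};
  for (const [key, value] of Object.entries(flat)) {
    const path = key.split('.');
    // Skip keys that would reach into the object prototype
    if (path.includes('__proto__')) continue;

    let node = result;
    for (const segment of path.slice(0, -1)) {
      // Create intermediate objects as needed (a scalar already stored
      // at this position is replaced by an object)
      if (typeof node[segment] !== 'object' || node[segment] === null) {
        node[segment] = {};
      }
      node = node[segment];
    }
    node[path[path.length - 1]] = value;
  }
  return result;
}

console.log(expandDotKeys({
  'host': 'localhost',
  'connection.pool.min': '5',
  'connection.pool.max': '20',
  'connection.timeout': '30'
}));
```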

### 2.2 Runtime Calculation Fields (Implementation Pattern)

Some implementations may choose to support calculated fields that are generated at parse time rather than stored in the file.

**Convention:** Use `::` prefix to mark calculated fields in field definitions.

**Example:**
```
[SALES]
{date|product|amount|::tax|::total}
2025-01-01|Widget A|100.00
2025-01-02|Widget B|150.00
2025-01-03|Widget C|200.00
```

**Parser behavior:**
When the parser encounters `::tax` and `::total`, it:
1. Calculates tax (e.g., `amount * 0.08`)
2. Calculates total (e.g., `amount + tax`)
3. Adds these fields to the returned data structure

**Returned data might look like:**
```javascript
[
  {date: "2025-01-01", product: "Widget A", amount: "100.00", tax: "8.00", total: "108.00"},
  {date: "2025-01-02", product: "Widget B", amount: "150.00", tax: "12.00", total: "162.00"},
  {date: "2025-01-03", product: "Widget C", amount: "200.00", tax: "16.00", total: "216.00"}
]
```

**Notes:**
- This is **not part of the SET file format specification**
- It's a convention some implementations may choose to support
- The calculation logic is entirely implementation-specific
- The `::` prefix is just a convention to distinguish calculated from stored fields
- Simple parsers can ignore `::` fields or treat them as documentation

**Use cases:**
- Computed totals, subtotals, running totals
- Calculated dates (e.g., expiration date from creation date)
- Derived values (e.g., full name from first + last name)
- Display formatting (e.g., formatted currency from raw numbers)
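
The parser behavior described in this section could be sketched as follows. Everything here is illustrative: `calculators` and `parseCalculatedTable` are hypothetical names, and the 8% tax rate is taken from the example rather than from any specification.

```javascript
// Hypothetical calculators, keyed by field name. The 8% tax rate matches
// the example in this section and is illustrative only.
const calculators = {
  tax:   row => (Number(row.amount) * 0.08).toFixed(2),
  total: row => (Number(row.amount) + Number(row.tax)).toFixed(2)
};

// Parse a table group whose field definition may contain '::' fields.
function parseCalculatedTable(definitionLine, dataLines) {
  const fields = definitionLine.slice(1, -1).split('|');
  const stored = fields.filter(name => !name.startsWith('::'));
  const calculated = fields
    .filter(name => name.startsWith('::'))
    .map(name => name.slice(2));

  return dataLines.map(line => {
    const values = line.split('|');
    const row = {};
    stored.forEach((name, i) => { row[name] = values[i] ?? ''; });
    // Calculated fields run in definition order, so 'total' can use 'tax'
    for (const name of calculated) {
      row[name] = calculators[name](row);
    }
    return row;
  });
}

const rows = parseCalculatedTable('{date|product|amount|::tax|::total}', [
  '2025-01-01|Widget A|100.00',
  '2025-01-02|Widget B|150.00'
]);
console.log(rows[0]);
```

Because calculated fields are evaluated in the order they appear in the field definition, a field such as `::total` can depend on an earlier one such as `::tax`.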

### 2.3 Type Hints via Key Conventions

Add type information through naming conventions.

**Suffix notation:**
```
[SETTINGS]
maxUsers_int|50
timeout_float|30.5
debugMode_bool|true
database_null|
tags_array|development,testing,production
[EOG]
```

**Colon notation:**
```
[SETTINGS]
maxUsers:int|50
timeout:float|30.5
debugMode:bool|true
[EOG]
```

**Value prefixes:**
```
[SETTINGS]
maxUsers|i:50
timeout|f:30.5
debugMode|b:true
[EOG]
```

Choose whatever convention fits your implementation.
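
As an illustration, the suffix notation might be applied after parsing like this. The `converters` table and `applyTypeSuffixes` are hypothetical names, and the conversion rules (for example, splitting `_array` values on commas) are assumptions based on the sample values above.

```javascript
// Hypothetical converters for the suffix convention
const converters = {
  int:   value => parseInt(value, 10),
  float: value => parseFloat(value),
  bool:  value => value === 'true',
  null:  () => null,
  array: value => (value === '' ? [] : value.split(','))
};

// Split 'maxUsers_int' into a name and a type, then convert the value.
// Keys without a recognized suffix are kept unchanged as strings.
function applyTypeSuffixes(group) {
  const typed = {};
  for (const [key, value] of Object.entries(group)) {
    const match = /^(.+)_(int|float|bool|null|array)$/.exec(key);
    if (match) {
      typed[match[1]] = converters[match[2]](value);
    } else {
      typed[key] = value;
    }
  }
  return typed;
}

console.log(applyTypeSuffixes({
  maxUsers_int: '50',
  timeout_float: '30.5',
  debugMode_bool: 'true',
  database_null: '',
  tags_array: 'development,testing,production'
}));
```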

### 2.4 Arrays and Lists

**Horizontal arrays (positional fields):**
```
[COLORS]
{name|red|green|blue}
Primary Red|255|0|0
Sky Blue|135|206|235
Forest Green|34|139|34
[EOG]
```

**Vertical arrays (repeated keys):**
```
[ALLOWED_IPS]
ip|192.168.1.1
ip|192.168.1.2
ip|192.168.1.3
ip|10.0.0.5
[EOG]
```

**Comma-separated lists in values:**
```
[USER_ROLES]
admin|create,read,update,delete
editor|read,update
viewer|read
[EOG]
```

**Nested arrays using secondary delimiter:**
```
[PERMISSIONS]
{role|permissions}
admin|users:create!users:delete!posts:all
editor|posts:create!posts:edit!posts:delete
viewer|posts:read!comments:read
[EOG]
```

The secondary delimiter (default `!`) allows nested array structures within a single field.
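
A short sketch of how a parser might turn two of these patterns into arrays. Both helper names are hypothetical, and escaping of delimiter characters is not handled.

```javascript
// Collect repeated keys from raw 'key|value' lines into arrays
function collectRepeatedKeys(lines) {
  const result = {};
  for (const line of lines) {
    const index = line.indexOf('|');
    if (index === -1) continue; // not a key-value line
    const key = line.slice(0, index);
    const value = line.slice(index + 1);
    if (!result[key]) {
      result[key] = [];
    }
    result[key].push(value);
  }
  return result;
}

// Split a field on the secondary delimiter (default '!')
function splitNested(value, secondary = '!') {
  return value === '' ? [] : value.split(secondary);
}

console.log(collectRepeatedKeys(['ip|192.168.1.1', 'ip|192.168.1.2', 'ip|10.0.0.5']));
console.log(splitNested('users:create!users:delete!posts:all'));
```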

### 2.5 Version Suffixes

Use version numbers in group names for managing configuration versions:

```
[DATABASE_V1]
host|localhost
port|3306
[EOG]

[DATABASE_V2]
host|db.example.com
port|5432
pool_size|20
[EOG]
```

### 2.6 Environment-Specific Configurations

```
[DATABASE_PRODUCTION]
host|prod.db.example.com
port|5432
[EOG]

[DATABASE_STAGING]
host|staging.db.example.com
port|5432
[EOG]

[DATABASE_DEVELOPMENT]
host|localhost
port|5432
[EOG]
```
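
With this naming pattern, selecting the active configuration is a simple lookup. The sketch below assumes a parsed `groups` object keyed by group name; `getEnvironmentGroup` is a hypothetical helper, not part of any SET library.

```javascript
// Hypothetical helper: pick '<BASE>_<ENVIRONMENT>' from parsed groups,
// e.g. DATABASE_PRODUCTION. Throws when no matching group exists.
function getEnvironmentGroup(groups, baseName, environment) {
  const groupName = `${baseName}_${environment.toUpperCase()}`;
  const group = groups[groupName];
  if (group === undefined) {
    throw new Error(`No group named [${groupName}]`);
  }
  return group;
}

const groups = {
  DATABASE_PRODUCTION: { host: 'prod.db.example.com', port: '5432' },
  DATABASE_DEVELOPMENT: { host: 'localhost', port: '5432' }
};
console.log(getEnvironmentGroup(groups, 'DATABASE', 'development').host);
```

Failing loudly on a missing group is a design choice for this sketch; silently falling back to another environment's settings is usually riskier than stopping.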

### 2.7 Base64 Encoding for Binary Data

Store binary data as Base64-encoded text in text blocks:

```
[APP_CONFIG]
icon|[{APP_ICON_BASE64}]
certificate|[{SSL_CERT_BASE64}]
[EOG]

[{APP_ICON_BASE64}]
iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg==
[EOG]

[{SSL_CERT_BASE64}]
MIIDXTCCAkWgAwIBAgIJAKL0UG+mRfQNMA0GCSqGSIb3DQEBCwUAMEUxCzAJBgNV
BAYTAkFVMRMwEQYDVQQIDApTb21lLVN0YXRlMSEwHwYDVQQKDBhJbnRlcm5ldCBX
[EOG]
```
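
Decoding such a block back into bytes might look like the following sketch. `decodeBase64Block` is a hypothetical helper; it relies on the standard `atob` function, which is available globally in browsers and in Node.js 16 or later.

```javascript
// Decode Base64 text block content into bytes. Whitespace and line
// breaks are removed first, since long values may be wrapped.
function decodeBase64Block(text) {
  const compact = text.replace(/\s+/g, '');
  const binary = atob(compact);
  return Uint8Array.from(binary, ch => ch.charCodeAt(0));
}

// The first 16 characters of the icon above decode to the PNG signature
// followed by the start of the first chunk
const bytes = decodeBase64Block('iVBORw0KGgoAAAAN');
console.log(String.fromCharCode(bytes[1], bytes[2], bytes[3])); // PNG
```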

---

## 3. SetTag Extensions

SetTags allow embedding SET file content within other file formats using their native comment syntax.

### 3.1 Basic SetTag Syntax

**Pattern:**
```
<comment-start> {SETTAG:TagName}
<comment> SET file content here
<comment> {/SETTAG/} <comment-end>
```

### 3.2 Language-Specific Examples

**JavaScript/PHP:**
```javascript
// {SETTAG:AppConfig}
// [SETTINGS]
// AppName|MyApp
// Version|1.0
// {/SETTAG/}
```

**Python:**
```python
# {SETTAG:Config}
# [DATABASE]
# Host|localhost
# Port|5432
# {/SETTAG/}
```

**HTML/Markdown:**
```html
<!-- {SETTAG:PageConfig}
[META]
Title|My Page
Author|John Doe
{/SETTAG/} -->
```

**CSS:**
```css
/* {SETTAG:ThemeConfig}
[COLORS]
Primary|#007bff
Secondary|#6c757d
{/SETTAG/} */
```

**Rust:**
```rust
// {SETTAG:BuildConfig}
// [FEATURES]
// Debug|true
// Optimize|false
// {/SETTAG/}
```

### 3.3 Extracting SetTags

Implementations can extract SetTags by:

1. Reading the file line by line
2. Looking for `{SETTAG:Name}` markers
3. Collecting lines until `{/SETTAG/}` is found
4. Stripping comment characters and spaces
5. Parsing the extracted content as a SET file

**Example extraction function (pseudocode):**
```javascript
function extractSetTag(filename, tagName) {
  let lines = readFileLines(filename);
  let inTag = false;
  let content = [];
  
  for (let line of lines) {
    if (line.includes(`{SETTAG:${tagName}}`)) {
      inTag = true;
      continue;
    }
    // Only a closing marker that follows our opening marker ends the tag
    if (inTag && line.includes('{/SETTAG/}')) {
      break;
    }
    if (inTag) {
      // Strip comment markers and leading space
      let cleaned = stripComments(line);
      content.push(cleaned);
    }
  }
  
  return parseSetContent(content.join('\n'));
}
```
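
The `stripComments` helper used in the extraction function is left undefined there. One possible sketch, covering only the comment styles shown in Section 3.2, is given here; the exact set of markers to strip is an assumption and would need extending for other host languages.

```javascript
// Hypothetical stripComments helper: remove one leading comment marker
// (and the whitespace around it) plus a trailing block-comment closer.
function stripComments(line) {
  return line
    .replace(/^\s*(\/\/|#|\/\*|\*|<!--)\s?/, '')
    .replace(/\s*(\*\/|-->)\s*$/, '')
    .trim();
}

console.log(stripComments('// [SETTINGS]'));     // [SETTINGS]
console.log(stripComments('# Port|5432'));       // Port|5432
console.log(stripComments('Primary|#007bff'));   // Primary|#007bff
```

Only a marker at the start of the line is removed, so values such as `#007bff` survive. A SET line that itself begins with `#` or `*` inside a block comment would still be mangled, which is a limitation of this simple approach.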

### 3.4 Use Cases for SetTags

- **Configuration in source files**: Embed config data directly in application code
- **Metadata in documents**: Store structured metadata in markdown or HTML files
- **Build settings**: Include build configuration in the files they affect
- **Documentation**: Keep related configuration with its documentation
- **Version control**: Track configuration changes alongside code changes

---

## 4. CRUD Operations

This section provides guidance for implementing Create, Read, Update, and Delete operations on SET files.

### 4.1 Reading Data

**Read entire file:**
```javascript
function readSetFile(filename) {
  let content = readFile(filename);
  return parseSetFile(content);
}
```

**Read specific group:**
```javascript
function readGroup(filename, groupName) {
  let data = readSetFile(filename);
  return data.groups[groupName];
}
```

**Read specific key-value:**
```javascript
function readValue(filename, groupName, key) {
  let group = readGroup(filename, groupName);
  return group[key];
}
```

**Read with text block resolution:**
```javascript
function readWithReferences(filename, groupName, key) {
  let value = readValue(filename, groupName, key);
  
  // Check if value is a text block reference such as [{BLOCK_NAME}]
  if (typeof value === 'string' && /^\[\{.*\}\]$/.test(value)) {
    let blockName = value.substring(2, value.length - 2);
    return readGroup(filename, blockName);
  }
  
  return value;
}
```
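
The functions in this section rely on a `parseSetFile` function that the guide does not define. The sketch below shows one minimal possibility, under several assumptions: groups end at `[EOG]`, `[EOF]`, or a blank line; text blocks are stored as strings under a `{NAME}` key (the convention Section 5.4 relies on, so a lookup by bare name would need adjusting); and escapes, delimiter overrides, `:::` fields, and the ellipsis shorthand are ignored.

```javascript
// Minimal sketch of parseSetFile. Handles key-value groups, table groups
// with a {field|...} definition line, and text blocks.
function parseSetFile(content) {
  const groups = {};
  let name = null;     // current group key, or null when outside a group
  let isText = false;  // true while inside a [{NAME}] text block
  let fields = null;   // field names once a definition line has been seen
  let textLines = [];

  const closeGroup = () => {
    if (name !== null && isText) groups[name] = textLines.join('\n');
    name = null; isText = false; fields = null; textLines = [];
  };

  for (const line of content.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === '[EOG]' || trimmed === '[EOF]') { closeGroup(); continue; }
    if (isText) { textLines.push(line); continue; } // blank lines are content
    if (trimmed === '') { closeGroup(); continue; }

    let match;
    if (name === null && (match = /^\[\{([A-Za-z0-9_-]+)\}\]$/.exec(trimmed))) {
      name = `{${match[1]}}`;
      isText = true;
    } else if (name === null && (match = /^\[([A-Za-z0-9_-]+)\]$/.exec(trimmed))) {
      name = match[1];
      groups[name] = {};
    } else if (name !== null && !fields &&
               trimmed.startsWith('{') && trimmed.endsWith('}')) {
      // Field definition line: the group becomes an array of row objects
      fields = trimmed.slice(1, -1).split('|');
      groups[name] = [];
    } else if (name !== null && fields) {
      const values = line.split('|');
      groups[name].push(
        Object.fromEntries(fields.map((field, i) => [field, values[i] ?? '']))
      );
    } else if (name !== null) {
      // Key-value line: split on the first delimiter only
      const index = line.indexOf('|');
      if (index > -1) {
        groups[name][line.slice(0, index).trim()] = line.slice(index + 1);
      }
    }
    // Lines outside any group (such as the filename line) are ignored
  }
  closeGroup();
  return { groups };
}

console.log(parseSetFile('[DATABASE]\nHost|localhost\nPort|5432\n[EOG]').groups);
```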

### 4.2 Creating Data

**Create new SET file:**
```javascript
function createSetFile(filename, config = {}) {
  let content = [];
  
  // Add filename identifier
  content.push(filename);
  content.push('');
  
  // Add [THIS-FILE] if config provided
  if (Object.keys(config).length > 0) {
    content.push('[THIS-FILE]');
    for (let [key, value] of Object.entries(config)) {
      content.push(`${key}|${value}`);
    }
    content.push('[EOG]');
    content.push('');
  }
  
  writeFile(filename, content.join('\n'));
}
```

**Add new group:**
```javascript
function addGroup(filename, groupName, data, isTextBlock = false) {
  let content = readFile(filename);
  
  // Add group header
  if (isTextBlock) {
    content += `\n[{${groupName}}]\n`;
    content += data + '\n';
  } else {
    content += `\n[${groupName}]\n`;
    for (let [key, value] of Object.entries(data)) {
      content += `${key}|${value}\n`;
    }
  }
  
  content += '[EOG]\n';
  writeFile(filename, content);
}
```

**Add line to existing group:**
```javascript
function addLine(filename, groupName, key, value) {
  let lines = readFileLines(filename);
  let inGroup = false;
  let insertIndex = -1;
  
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === `[${groupName}]`) {
      inGroup = true;
      continue;
    }
    
    if (inGroup && (lines[i].trim() === '' || 
                     lines[i].trim().startsWith('[EOG]') ||
                     lines[i].trim().startsWith('['))) {
      insertIndex = i;
      break;
    }
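  }
  
  // Group runs to the end of the file without a terminator: append there
  if (inGroup && insertIndex === -1) {
    insertIndex = lines.length;
  }
  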
  if (insertIndex > -1) {
    lines.splice(insertIndex, 0, `${key}|${value}`);
    writeFileLines(filename, lines);
  }
}
```

### 4.3 Updating Data

**Update existing value:**
```javascript
function updateValue(filename, groupName, key, newValue) {
  let lines = readFileLines(filename);
  let inGroup = false;
  
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === `[${groupName}]`) {
      inGroup = true;
      continue;
    }
    
    if (inGroup) {
      if (lines[i].trim() === '' || 
          lines[i].trim().startsWith('[EOG]') ||
          lines[i].trim().startsWith('[')) {
        break;
      }
      
      let parts = lines[i].split('|');
      if (parts[0].trim() === key) {
        parts[1] = newValue;
        lines[i] = parts.join('|');
        writeFileLines(filename, lines);
        return true;
      }
    }
  }
  
  return false; // Key not found
}
```

**Update entire group:**
```javascript
function updateGroup(filename, groupName, newData) {
  let lines = readFileLines(filename);
  let startIndex = -1;
  let endIndex = -1;
  
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === `[${groupName}]`) {
      startIndex = i;
      continue;
    }
    
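    if (startIndex > -1 && (lines[i].trim() === '' || 
                             lines[i].trim().startsWith('[EOG]') ||
                             lines[i].trim().startsWith('['))) {
      // Replace the old [EOG] too, since a fresh one is written below
      endIndex = lines[i].trim() === '[EOG]' ? i + 1 : i;
      break;
    }
  }
  
  if (startIndex > -1) {
    // A group that runs to the end of the file has no terminator line
    if (endIndex === -1) {
      endIndex = lines.length;
    }
    
    let newLines = [`[${groupName}]`];
    for (let [key, value] of Object.entries(newData)) {
      newLines.push(`${key}|${value}`);
    }
    newLines.push('[EOG]');
    
    lines.splice(startIndex, endIndex - startIndex, ...newLines);
    writeFileLines(filename, lines);
    return true;
  }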
  
  return false;
}
```

### 4.4 Deleting Data

**Delete specific key:**
```javascript
function deleteKey(filename, groupName, key) {
  let lines = readFileLines(filename);
  let inGroup = false;
  
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === `[${groupName}]`) {
      inGroup = true;
      continue;
    }
    
    if (inGroup) {
      if (lines[i].trim() === '' || 
          lines[i].trim().startsWith('[EOG]') ||
          lines[i].trim().startsWith('[')) {
        break;
      }
      
      let parts = lines[i].split('|');
      if (parts[0].trim() === key) {
        lines.splice(i, 1);
        writeFileLines(filename, lines);
        return true;
      }
    }
  }
  
  return false;
}
```

**Delete entire group:**
```javascript
function deleteGroup(filename, groupName) {
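  let lines = readFileLines(filename);
  let startIndex = -1;
  let endIndex = -1;
  let isTextBlock = false;
  
  for (let i = 0; i < lines.length; i++) {
    let trimmed = lines[i].trim();
    
    if (startIndex === -1) {
      if (trimmed === `[${groupName}]` || trimmed === `[{${groupName}}]`) {
        startIndex = i;
        isTextBlock = trimmed.startsWith('[{');
      }
      continue;
    }
    
    if (trimmed === '[EOG]') {
      endIndex = i + 1; // Remove the terminator along with the group
      break;
    }
    
    // Blank lines are content inside a text block; elsewhere a blank line
    // or the next group header ends the group and must be left in place
    if (!isTextBlock && (trimmed === '' || trimmed.startsWith('['))) {
      endIndex = i;
      break;
    }
  }
  
  if (startIndex > -1) {
    // A group that runs to the end of the file has no terminator line
    if (endIndex === -1) {
      endIndex = lines.length;
    }
    lines.splice(startIndex, endIndex - startIndex);
    writeFileLines(filename, lines);
    return true;
  }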
  
  return false;
}
```

### 4.5 Transaction Safety

For critical operations, implement atomic writes:

```javascript
function atomicUpdate(filename, updateFunction) {
  let tempFile = filename + '.tmp';
  let backupFile = filename + '.backup';
  
  try {
    // Create backup
    copyFile(filename, backupFile);
    
    // Perform update on temp file
    copyFile(filename, tempFile);
    updateFunction(tempFile);
    
    // Verify temp file is valid (validateSetFile takes file content and
    // returns { valid, errors })
    if (validateSetFile(readFile(tempFile)).valid) {
      // Replace original with temp
      moveFile(tempFile, filename);
      deleteFile(backupFile);
      return true;
    } else {
      throw new Error('Validation failed');
    }
  } catch (error) {
    // Restore from backup, then clean up leftover files
    if (fileExists(backupFile)) {
      copyFile(backupFile, filename);
      deleteFile(backupFile);
    }
    if (fileExists(tempFile)) {
      deleteFile(tempFile);
    }
    return false;
  }
}
```

---

## 5. Validation & Error Handling

### 5.1 File-Level Validation

**Basic structure validation:**
```javascript
function validateSetFile(content) {
  let errors = [];
  let lines = content.split('\n');
  let inGroup = false;
  let inTextGroup = false;
  let currentGroup = null;
  let groupNames = new Set();
  
  for (let i = 0; i < lines.length; i++) {
    let line = lines[i];
    let trimmed = line.trim();
    
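    // Group terminators are checked first: '[EOG]' and '[EOF]' would
    // otherwise match the group-name pattern below
    if (trimmed === '[EOG]' || trimmed === '[EOF]') {
      inGroup = false;
      inTextGroup = false;
      currentGroup = null;
      continue;
    }
    
    // Check for group start
    if (trimmed.match(/^\[([A-Za-z0-9_-]+)\]$/)) {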
      let groupName = trimmed.substring(1, trimmed.length - 1);
      
      // Check for duplicate group names
      if (groupNames.has(groupName)) {
        errors.push(`Line ${i + 1}: Duplicate group name '${groupName}'`);
      }
      groupNames.add(groupName);
      
      inGroup = true;
      inTextGroup = false;
      currentGroup = groupName;
      continue;
    }
    
    // Check for text group start
    if (trimmed.match(/^\[\{([A-Za-z0-9_-]+)\}\]$/)) {
      let groupName = trimmed.substring(2, trimmed.length - 2);
      
      if (groupNames.has(groupName)) {
        errors.push(`Line ${i + 1}: Duplicate group name '${groupName}'`);
      }
      groupNames.add(groupName);
      
      inGroup = true;
      inTextGroup = true;
      currentGroup = groupName;
      continue;
    }
    
    // Check for invalid group names
    if (trimmed.startsWith('[') && !trimmed.match(/^\[[A-Za-z0-9_-]+\]$/) && 
        !trimmed.match(/^\[\{[A-Za-z0-9_-]+\}\]$/) &&
        trimmed !== '[EOG]' && trimmed !== '[EOF]') {
      errors.push(`Line ${i + 1}: Invalid group name format '${trimmed}'`);
    }
    
    // Check for [EOG] or empty line ending groups
    if ((trimmed === '[EOG]' || trimmed === '') && !inTextGroup) {
      inGroup = false;
      currentGroup = null;
    }
  }
  
  return {
    valid: errors.length === 0,
    errors: errors
  };
}
```

### 5.2 Group-Level Validation

**Validate group structure:**
```javascript
function validateGroup(groupContent, groupType = 'data') {
  let errors = [];
  let lines = groupContent.split('\n').filter(l => l.trim());
  
  if (groupType === 'keyvalue') {
    for (let i = 0; i < lines.length; i++) {
      if (!lines[i].includes('|')) {
        errors.push(`Line ${i + 1}: Key-value line must contain delimiter '|'`);
      }
      
      let parts = lines[i].split('|');
      if (parts.length < 2) {
        errors.push(`Line ${i + 1}: Key-value line must have at least key and value`);
      }
      
      if (parts[0].trim() === '') {
        errors.push(`Line ${i + 1}: Key cannot be empty`);
      }
    }
  }
  
  if (groupType === 'table') {
    // First line should be field definition
    if (lines.length > 0 && !lines[0].startsWith('{')) {
      errors.push('Table group must start with field definition {field1|field2|...}');
    }
    
    if (lines.length > 1) {
      let fieldCount = lines[0].split('|').length;
      for (let i = 1; i < lines.length; i++) {
        let dataCount = lines[i].split('|').length;
        if (dataCount > fieldCount && !lines[i].includes(':::') && !lines[i].includes('…')) {
          errors.push(`Line ${i + 1}: Data field count exceeds definition (${dataCount} > ${fieldCount})`);
        }
      }
    }
  }
  
  return {
    valid: errors.length === 0,
    errors: errors
  };
}
```

### 5.3 Data Type Validation

**Validate data types (when using type conventions):**
```javascript
function validateTypes(data, schema) {
  let errors = [];
  
  for (let [key, expectedType] of Object.entries(schema)) {
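    let value = data[key];
    
    // A missing value cannot be type-checked (and would throw in the
    // string-based checks below), so report it and move on
    if (value === undefined || value === null) {
      errors.push(`${key}: Missing value`);
      continue;
    }
    
    switch(expectedType) {
      case 'int':
        // Number('') is 0, so match the digits explicitly
        if (!/^[+-]?\d+$/.test(String(value).trim())) {
          errors.push(`${key}: Expected integer, got '${value}'`);
        }
        break;
        
      case 'float':
        // parseFloat accepts trailing garbage such as '30.5abc'
        if (String(value).trim() === '' || !Number.isFinite(Number(value))) {
          errors.push(`${key}: Expected float, got '${value}'`);
        }
        break;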
        
      case 'bool':
        if (value !== 'true' && value !== 'false') {
          errors.push(`${key}: Expected boolean (true/false), got '${value}'`);
        }
        break;
        
      case 'email':
        if (!value.match(/^[^\s@]+@[^\s@]+\.[^\s@]+$/)) {
          errors.push(`${key}: Invalid email format '${value}'`);
        }
        break;
        
      case 'url':
        try {
          new URL(value);
        } catch {
          errors.push(`${key}: Invalid URL format '${value}'`);
        }
        break;
    }
  }
  
  return {
    valid: errors.length === 0,
    errors: errors
  };
}
```

### 5.4 Reference Validation

**Validate text block references:**
```javascript
function validateReferences(data) {
  let errors = [];
  let availableBlocks = new Set();
  
  // Collect all text block names
  for (let groupName of Object.keys(data.groups)) {
    if (groupName.startsWith('{') && groupName.endsWith('}')) {
      availableBlocks.add(groupName);
    }
  }
  
  // Check all references
  for (let [groupName, groupData] of Object.entries(data.groups)) {
    if (typeof groupData === 'object') {
      for (let [key, value] of Object.entries(groupData)) {
        if (typeof value === 'string' && value.match(/^\[\{.*\}\]$/)) {
          let refName = value.substring(2, value.length - 2);
          if (!availableBlocks.has(`{${refName}}`)) {
            errors.push(`${groupName}.${key}: Reference to undefined text block '${refName}'`);
          }
        }
      }
    }
  }
  
  return {
    valid: errors.length === 0,
    errors: errors
  };
}
```

### 5.5 Error Recovery Strategies

**Lenient parsing with error collection:**
```javascript
function parseWithRecovery(content) {
  let result = {
    data: {},
    errors: [],
    warnings: []
  };
  
  try {
    result.data = parseSetFile(content);
  } catch (error) {
    result.errors.push(`Fatal parse error: ${error.message}`);
    
    // Attempt partial recovery
    try {
      result.data = parsePartialSetFile(content);
      result.warnings.push('File parsed with errors, some data may be missing');
    } catch {
      result.errors.push('Unable to recover any data from file');
    }
  }
  
  // Validate what we got
  let validation = validateSetFile(content);
  result.errors.push(...validation.errors);
  
  return result;
}
```

**Graceful degradation:**
```javascript
function readValueSafe(filename, groupName, key, defaultValue = null) {
  try {
    return readValue(filename, groupName, key);
  } catch (error) {
    console.warn(`Failed to read ${groupName}.${key}: ${error.message}`);
    return defaultValue;
  }
}
```

---

## 6. Programming Interface Guidelines

### 6.1 Parser Design Principles

**Minimal parser (QSet approach):**
- Parse only sections 1-3 of specification
- Built-in parsing, no external libraries
- Single file implementation
- Focus on read-only operations

**Standard parser:**
- Support sections 1-4 of specification
- Include CRUD operations
- Basic validation
- May use helper libraries

**Full-featured parser:**
- Complete specification support
- SetQL query language
- Advanced validation
- Transaction safety
- Reference resolution
- Optional features (runtime calculations, type hints)

### 6.2 API Design Patterns

**Object-oriented approach:**
```javascript
class SetFile {
  constructor(filename) {
    this.filename = filename;
    this.data = this.load();
  }
  
  load() {
    let content = readFile(this.filename);
    return parseSetFile(content);
  }
  
  save() {
    let content = serializeSetFile(this.data);
    writeFile(this.filename, content);
  }
  
  getGroup(name) {
    return this.data.groups[name];
  }
  
  getValue(group, key) {
    return this.data.groups[group]?.[key];
  }
  
  setValue(group, key, value) {
    if (!this.data.groups[group]) {
      this.data.groups[group] = {};
    }
    this.data.groups[group][key] = value;
  }
  
  query(setql) {
    return executeQuery(this.data, setql);
  }
}

// Usage
let config = new SetFile('app.set');
let dbHost = config.getValue('DATABASE', 'Host');
config.setValue('DATABASE', 'Port', '5432');
config.save();
```

**Functional approach:**
```javascript
// Pure functions, immutable data
const SetFile = {
  parse: (content) => parseSetFile(content),
  
  serialize: (data) => serializeSetFile(data),
  
  getGroup: (data, groupName) => 
    data.groups[groupName],
  
  getValue: (data, groupName, key) =>
    data.groups[groupName]?.[key],
  
  setValue: (data, groupName, key, value) => ({
    ...data,
    groups: {
      ...data.groups,
      [groupName]: {
        ...data.groups[groupName],
        [key]: value
      }
    }
  }),
  
  query: (data, setql) => executeQuery(data, setql)
};

// Usage
let content = readFile('app.set');
let data = SetFile.parse(content);
let dbHost = SetFile.getValue(data, 'DATABASE', 'Host');
let updated = SetFile.setValue(data, 'DATABASE', 'Port', '5432');
writeFile('app.set', SetFile.serialize(updated));
```
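
Both API styles call a `serializeSetFile` function that is not defined in this guide. A possible sketch follows, assuming the parsed structure holds plain objects for key-value groups, arrays of row objects for tables, and strings under `{NAME}` keys for text blocks. The optional `data.filename` property is an assumption introduced here to carry the filename line.

```javascript
// Hypothetical serializeSetFile: writes groups back out in SET syntax.
// Escaping of delimiter characters inside values is not handled.
function serializeSetFile(data) {
  const out = data.filename ? [data.filename, ''] : [];
  for (const [name, group] of Object.entries(data.groups)) {
    out.push(`[${name}]`); // text block keys already include the { }
    if (typeof group === 'string') {
      out.push(group);
    } else if (Array.isArray(group)) {
      // Field names are taken from the first row; an empty table
      // therefore loses its field definition
      const fields = group.length > 0 ? Object.keys(group[0]) : [];
      out.push(`{${fields.join('|')}}`);
      for (const row of group) {
        out.push(fields.map(field => row[field] ?? '').join('|'));
      }
    } else {
      for (const [key, value] of Object.entries(group)) {
        out.push(`${key}|${value}`);
      }
    }
    out.push('[EOG]', '');
  }
  out.push('[EOF]');
  return out.join('\n');
}

console.log(serializeSetFile({
  filename: 'app.set',
  groups: { DATABASE: { Host: 'localhost', Port: '5432' } }
}));
```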

### 6.3 Caching Strategies

**Lazy loading with cache:**
```javascript
class CachedSetFile {
  constructor(filename) {
    this.filename = filename;
    this.cache = null;
    this.lastModified = null;
  }
  
  getData() {
    let currentModTime = getFileModTime(this.filename);
    
    if (this.cache === null || currentModTime > this.lastModified) {
      this.cache = parseSetFile(readFile(this.filename));
      this.lastModified = currentModTime;
    }
    
    return this.cache;
  }
  
  invalidate() {
    this.cache = null;
  }
  
  getValue(group, key) {
    return this.getData().groups[group]?.[key];
  }
}
```

### 6.4 Stream Processing

**For large files:**
```javascript
function* streamSetGroups(filename) {
  let lines = readFileStream(filename);
  let currentGroup = null;
  let groupLines = [];
  
  for (let line of lines) {
    let trimmed = line.trim();
    
    if (trimmed === '' || trimmed === '[EOG]' || trimmed === '[EOF]') {
      // Group ending (checked first: '[EOG]' and '[EOF]' would otherwise
      // match the group-name pattern below)
      if (currentGroup) {
        yield {
          name: currentGroup,
          content: parseGroupLines(groupLines)
        };
        currentGroup = null;
        groupLines = [];
      }
    } else if (trimmed.match(/^\[[A-Za-z0-9_-]+\]$/)) {
      // New group starting
      if (currentGroup) {
        yield {
          name: currentGroup,
          content: parseGroupLines(groupLines)
        };
      }
      currentGroup = trimmed.substring(1, trimmed.length - 1);
      groupLines = [];
    } else if (currentGroup) {
      groupLines.push(line);
    }
  }
  
  // Handle last group if file doesn't end with EOG
  if (currentGroup && groupLines.length > 0) {
    yield {
      name: currentGroup,
      content: parseGroupLines(groupLines)
    };
  }
}

// Usage
for (let group of streamSetGroups('large.set')) {
  console.log(`Processing ${group.name}...`);
  processGroup(group.content);
}
```

### 6.5 Internationalization Support

**Handling locale settings from [THIS-FILE]:**
```javascript
function parseWithLocale(content) {
  let data = parseSetFile(content);
  let locale = data.groups['THIS-FILE']?.Localize;
  
  if (locale) {
    let [normalization, language, direction] = locale.split('|');
    
    // Apply normalization
    if (normalization === 'NFC') {
      data = normalizeNFC(data);
    }
    
    // Set text direction
    data.metadata = {
      ...data.metadata,
      textDirection: direction || 'LTR',
      language: language || 'en-US'
    };
  }
  
  return data;
}
```
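
The `normalizeNFC` call above is not defined in this guide. A minimal sketch, assuming the parsed data is a plain tree of objects, arrays, and strings, can lean on the standard `String.prototype.normalize` method:

```javascript
// Hypothetical normalizeNFC: walk the parsed structure and apply
// Unicode NFC normalization to every string key and value.
function normalizeNFC(value) {
  if (typeof value === 'string') {
    return value.normalize('NFC');
  }
  if (Array.isArray(value)) {
    return value.map(normalizeNFC);
  }
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k.normalize('NFC'), normalizeNFC(v)])
    );
  }
  return value; // numbers, booleans, null, undefined
}

// 'e' followed by a combining acute accent becomes the single character 'é'
const normalized = normalizeNFC({ groups: { MENU: { name: 'Cafe\u0301' } } });
console.log(normalized.groups.MENU.name.length); // 4
```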

---

## 7. Complete Examples

### 7.1 Simple Configuration File

```
app_config.set

[THIS-FILE]
Version|4.2
Created|2025-12-04
[EOG]

[DATABASE]
Host|localhost
Port|5432
Database|myapp
User|dbuser
Password|[{DB_PASSWORD}]
[EOG]

[APP_SETTINGS]
AppName|My Application
Version|1.0.0
Debug|false
LogLevel|info
[EOG]

[EMAIL]
SMTPHost|smtp.example.com
SMTPPort|587
FromAddress|noreply@example.com
[EOG]

[{DB_PASSWORD}]
encrypted_password_here_base64_encoded
[EOG]

[EOF]
```

### 7.2 Data Table with Field Definitions

```
users.set

[THIS-FILE]
Version|4.2
Delimiters|:[]:{}:|:\:…:!
[EOG]

[USERS]
{id|username|email|role|created_at}
1|alice|alice@example.com|admin|2025-01-15
2|bob|bob@example.com|editor|2025-02-20
3|charlie|charlie@example.com|viewer|2025-03-10
4|diana|diana@example.com|editor|2025-03-15
[EOG]

[USER_PERMISSIONS]
{user_id|resource|actions}
1|all|create!read!update!delete
2|posts|create!read!update!delete
2|comments|read!update
3|posts|read
3|comments|read
4|posts|create!read!update!delete
[EOG]

[EOF]
```

### 7.3 Mixed Content with Text Blocks

```
project.set

[THIS-FILE]
Version|4.2
Created|2025-12-04
[EOG]

[PROJECT_INFO]
Name|Advanced Demo
Version|2.0
Description|[{PROJECT_DESC}]
License|[{LICENSE}]
[EOG]

[DEPENDENCIES]
{package|version|source}
react|18.2.0|npm
typescript|5.0.0|npm
vite|4.3.0|npm
[EOG]

[BUILD_CONFIG]
Target|es2020
OutDir|dist
SourceMap|true
Minify|false
[EOG]

[{PROJECT_DESC}]
This is a demonstration project showcasing the SET file format v4.2.

It includes multiple types of data:
- Configuration settings
- Tabular data with field definitions
- Text blocks for longer content
- Nested arrays using secondary delimiters

The project serves as both documentation and a working example.
[EOG]

[{LICENSE}]
MIT License

Copyright (c) 2025 Example Corp

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
[EOG]

[EOF]
```

### 7.4 Advanced Features Demo

```
advanced_demo.set

[THIS-FILE]
Version|4.2
Delimiters|:[]:{}:|:\:…:!
Created|2025-12-04
[EOG]

[SALES_DATA]
{date|product|quantity|price|::subtotal|::tax|::total}
2025-01-15|Widget A|10|25.00
2025-01-16|Widget B|5|50.00|:::rush_order:true
2025-01-17|Widget C|3|100.00
[EOG]

[CONTACT_INFO]
{id|name|phone|email|address|city|state|zip}
1|Alice Smith|555-1234|alice@example.com|123 Main St|Seattle|WA|98101
2|Bob Jones|555-5678|bob@example.com|…
3|Carol White|555-9012|carol@example.com|456 Oak Ave|Portland|OR|97201
[EOG]

[API_ENDPOINTS]
{name|url|method}
GetUsers|https://api.example.com/users|GET
:!CreateUser!https://api.example.com/users?format=json|include=profile|expand=roles!POST
UpdateUser|https://api.example.com/users/{id}|PUT
DeleteUser|https://api.example.com/users/{id}|DELETE
[EOG]

[NESTED_PERMISSIONS]
{role|modules}
admin|users!posts!comments!settings
editor|posts!comments
author|posts
moderator|comments!users
[EOG]

[README]
[{PROJECT_README}]
[EOG]

[{PROJECT_README}]
# Advanced Demo Project

## Features Demonstrated

1. **Runtime Calculation Pattern (::)**
   - Implementation-specific feature (see Section 2.2)
   - Calculated subtotal, tax, and total fields
   - Not part of core spec, but a common convention

2. **Single-Use Fields (:::)**
   - Per-record metadata without modifying field definition
   - Ad-hoc notes like rush_order flag
   - Useful for one-off data annotations

3. **Ellipsis Shorthand (…)**
   - Sparse data representation
   - Reduces file size when many trailing fields are empty
   - Improves readability

4. **Single-Line Delimiter Override**
   - Complex URLs with query parameters containing pipes
   - Data containing standard delimiter characters
   - Line-specific delimiter changes

5. **Nested Arrays with Secondary Delimiter (!)**
   - Arrays within fields using secondary delimiter
   - Permission sets, tag lists, hierarchical data
   - Default secondary delimiter is !

6. **Text Block References**
   - Multi-line content stored separately
   - Referenced from data fields
   - Keeps data clean and organized

## Usage

Parse this file with a SET file parser that supports v4.2 features.
For runtime calculations, your parser must implement the :: pattern.

Nested arrays require secondary delimiter support (section 1.3.3).
[EOG]

[EOF]
```
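
The record-level features in this demo (single-line delimiter override, single-use fields, ellipsis shorthand, nested arrays) can be sketched in a few lines of Python. This is a minimal illustration, not a reference parser: the function names `parse_record` and `split_nested` are hypothetical, single-use fields are assumed to be written as `:::name:value`, and escape sequences such as `\|` are deliberately ignored.

```python
def parse_record(line, fields, delim="|", preamble=":", ellipsis="…"):
    """Split one data line into (record, single_use) dicts. Illustrative only."""
    # Single-line delimiter override: preamble + one-off delimiter, e.g. ":!a!b!c"
    if len(line) > 1 and line[0] == preamble and line[1] != preamble:
        delim, line = line[1], line[2:]

    marker = preamble * 3
    values, single_use = [], {}
    for value in line.split(delim):
        if value.startswith(marker):
            # Single-use field, assumed to be written as :::name:value
            name, _, val = value[len(marker):].partition(preamble)
            single_use[name] = val
        else:
            values.append(value)

    # Ellipsis shorthand: a trailing "…" means all remaining fields are empty
    if values and values[-1] == ellipsis:
        values.pop()
    values += [""] * (len(fields) - len(values))
    return dict(zip(fields, values)), single_use


def split_nested(value, secondary="!"):
    """Split a nested-array field on the secondary delimiter."""
    return value.split(secondary) if value else []


# Usage with lines taken from advanced_demo.set
endpoint, _ = parse_record(
    ":!CreateUser!https://api.example.com/users?format=json|include=profile|expand=roles!POST",
    ["name", "url", "method"],
)
sale, notes = parse_record(
    "2025-01-16|Widget B|5|50.00|:::rush_order:true",
    ["date", "product", "quantity", "price"],
)
modules = split_nested("users!posts!comments!settings")
```

Returning single-use fields separately from the positional record keeps the field definition untouched, which matches the intent described in the demo's README block.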

### 7.5 Environment-Specific Configuration

```
env_config.set

[THIS-FILE]
Version|4.2
Environment|production
[EOG]

[DATABASE_PRODUCTION]
Host|prod-db-01.example.com
Port|5432
Database|myapp_prod
User|prod_user
Password|[{DB_PROD_PASSWORD}]
PoolSize|50
Timeout|30
SSL|true
[EOG]

[DATABASE_STAGING]
Host|staging-db.example.com
Port|5432
Database|myapp_staging
User|staging_user
Password|[{DB_STAGING_PASSWORD}]
PoolSize|20
Timeout|30
SSL|true
[EOG]

[DATABASE_DEVELOPMENT]
Host|localhost
Port|5432
Database|myapp_dev
User|dev_user
Password|dev_password
PoolSize|5
Timeout|60
SSL|false
[EOG]

[CACHE_PRODUCTION]
Provider|redis
Host|prod-cache-01.example.com
Port|6379
TTL|3600
MaxMemory|2GB
[EOG]

[CACHE_STAGING]
Provider|redis
Host|staging-cache.example.com
Port|6379
TTL|1800
MaxMemory|1GB
[EOG]

[CACHE_DEVELOPMENT]
Provider|memory
TTL|300
MaxMemory|100MB
[EOG]

[{DB_PROD_PASSWORD}]
<encrypted_password_here>
[EOG]

[{DB_STAGING_PASSWORD}]
<encrypted_password_here>
[EOG]

[EOF]
```
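
One possible way to consume this layout is to read `Environment` from `[THIS-FILE]` and pick the matching `DATABASE_*` and `CACHE_*` groups, resolving text block references along the way. The sketch below assumes the file has already been parsed into plain dictionaries (groups keyed by name, text blocks keyed by name); that shape and the function name `select_environment` are assumptions for illustration, not part of any SET API.

```python
import re

# A value that consists solely of a text block reference, e.g. [{DB_PROD_PASSWORD}]
TEXT_REF = re.compile(r"^\[\{(.+)\}\]$")


def select_environment(groups, text_blocks, prefixes=("DATABASE", "CACHE")):
    """Return the settings for the environment named in [THIS-FILE]."""
    env = groups["THIS-FILE"]["Environment"].upper()
    selected = {}
    for prefix in prefixes:
        group = groups.get(f"{prefix}_{env}", {})
        resolved = {}
        for key, value in group.items():
            match = TEXT_REF.match(value)
            # Replace a reference with the text block content, keep other values as-is
            resolved[key] = text_blocks[match.group(1)] if match else value
        selected[prefix.lower()] = resolved
    return selected


# Usage with a hand-built structure mirroring env_config.set
groups = {
    "THIS-FILE": {"Version": "4.2", "Environment": "production"},
    "DATABASE_PRODUCTION": {"Host": "prod-db-01.example.com",
                            "Password": "[{DB_PROD_PASSWORD}]"},
    "CACHE_PRODUCTION": {"Provider": "redis", "TTL": "3600"},
}
text_blocks = {"DB_PROD_PASSWORD": "<encrypted_password_here>"}
settings = select_environment(groups, text_blocks)
```

Keeping the environment name in one place means switching from production to staging is a single-line change in `[THIS-FILE]`.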

### 7.6 Embedded SetTag Example

**config.js with embedded SET configuration:**
```javascript
// {SETTAG:AppConfig}
// [SETTINGS]
// AppName|My JavaScript App
// Version|1.0.0
// Debug|false
// {/SETTAG/}

const config = extractSetTag('config.js', 'AppConfig');
console.log(config.SETTINGS.AppName); // "My JavaScript App"
```

**index.html with metadata:**
```html
<!DOCTYPE html>
<html>
<head>
  <!-- {SETTAG:PageMeta}
  [META]
  Title|My Page
  Author|John Doe
  Keywords|demo,example,set-file
  Description|[{PAGE_DESC}]
  
  [{PAGE_DESC}]
  This is a demo page showing how SET files can be
  embedded in HTML documents using comment tags.
  {/SETTAG/} -->
  
  <title>My Page</title>
</head>
<body>
  <h1>Demo Page</h1>
</body>
</html>
```
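
For comparison, the extraction step can be sketched in Python. This is an assumption-laden illustration rather than the guide's `extractSetTag` helper: the name `extract_settag` is hypothetical, only `//` line-comment prefixes are stripped, and leading indentation is removed from every line (which also alters indented text block content).

```python
import re


def extract_settag(source_text, tag_name):
    """Return the SET content between {SETTAG:name} and {/SETTAG/}, or None."""
    pattern = re.compile(
        r"\{SETTAG:" + re.escape(tag_name) + r"\}(.*?)\{/SETTAG/\}",
        re.DOTALL,
    )
    match = pattern.search(source_text)
    if match is None:
        return None
    cleaned = []
    for raw in match.group(1).splitlines():
        # Drop indentation, an optional "//" comment prefix, and one following space
        cleaned.append(re.sub(r"^\s*(?://)?\s?", "", raw).rstrip())
    return "\n".join(cleaned).strip()


# Usage with the config.js example from this section
js_source = """// {SETTAG:AppConfig}
// [SETTINGS]
// AppName|My JavaScript App
// Version|1.0.0
// Debug|false
// {/SETTAG/}
"""
set_text = extract_settag(js_source, "AppConfig")
```

The returned string is ordinary SET content and can be handed to whatever parser the application already uses.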

---

## 8. Version History & Migration

### Version 4.2 (December 2025) - Refinements & Clarifications

**Changes from v4.0 to v4.2:**

1. **Enhanced Delimiter Documentation**
   - Clarified delimiter positions in `Delimiters` setting
   - Standard format: `:[]:{}:|:\:…:!`
   - Positions, in order: preamble (`:`), group start/end (`[]`), text group start/end (`{}`), field (`|`), escape (`\`), ellipsis (`…`), secondary (`!`)

2. **Secondary Delimiter Explicit Support**
   - Added `!` as default secondary delimiter for nested arrays
   - Documented in the `[THIS-FILE]` `Delimiters` setting
   - Enables nested data structures within fields

3. **Single-Line Delimiter Override Clarification**
   - Syntax: Line starts with preamble delimiter + single-use delimiter
   - Example: `:!URL!https://example.com/api?param=value|other=value`
   - Clearer documentation of use cases

4. **Text Group Naming Consistency**
   - Standardized on `[{TEXT-GROUP}]` syntax throughout documentation
   - Consistent usage of curly braces in group names
   - Clearer distinction from regular groups

5. **Documentation Improvements**
   - Better examples for all advanced features
   - Clearer explanation of implementation flexibility
   - Enhanced migration guidance

**Migration from v4.0 to v4.2:**

Files created with v4.0 are **fully compatible** with v4.2 parsers. No file changes required.

If a file declares `Delimiters` in `[THIS-FILE]`, it is recommended (but optional) to append the secondary delimiter explicitly:

v4.0 format:
```
[THIS-FILE]
Delimiters|:[]:{}:|:\:…
[EOG]
```

v4.2 format (recommended):
```
[THIS-FILE]
Delimiters|:[]:{}:|:\:…:!
[EOG]
```

This makes the secondary delimiter explicit, so a parser does not have to assume the default `!` when splitting nested arrays.
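
The upgrade of a `Delimiters` value can be automated. The following is a minimal sketch, assuming the value starts with the preamble delimiter, that the preamble delimiter separates the remaining entries, and that exactly five entries precede the secondary delimiter; the function name `ensure_secondary_delimiter` is hypothetical.

```python
def ensure_secondary_delimiter(delimiters, secondary="!"):
    """Append the secondary delimiter to a v4.0-style Delimiters value."""
    preamble = delimiters[0]
    # Split on the preamble delimiter; a trailing separator yields an empty entry
    parts = [p for p in delimiters[1:].split(preamble) if p]
    if len(parts) >= 6:
        return delimiters  # secondary delimiter already declared
    if len(parts) < 5:
        raise ValueError("unexpected Delimiters value: " + delimiters)
    return preamble + preamble.join(parts + [secondary])


# Usage: both v4.0 spellings (with or without trailing separator) normalize the same way
upgraded = ensure_secondary_delimiter(r":[]:{}:|:\:…")
```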

---

### Version 4.0 (November 2025) - Major Simplification

**Philosophy Change:**
Version 4.0 represented a fundamental shift toward simplicity and implementation flexibility. The format was simplified while maintaining backward compatibility with most v3.x files.

**Major Changes:**

1. **Removed Mandatory Preamble**
   - v3.x: Required 4-7 line preamble with specific format
   - v4.0: Optional `[THIS-FILE]` group for configuration
   - Benefit: Simpler files, easier to get started

2. **Eliminated Group Type Distinction**
   - v3.x: `[=KEYVALUE=]` syntax for key-value groups
   - v4.0: Just `[GROUPNAME]` - can contain positional or key-value data
   - Benefit: Less syntax to remember, cleaner files

3. **Removed Comment Block Syntax**
   - v3.x: `{|[COMMENT]|}` ... `{|[/COMMENT]|}`
   - v4.0: Text outside groups is inherently a comment
   - Benefit: Simpler, more natural documentation

4. **Added Features**
   - Single-line delimiter override: `:!field!field!field`
   - Implicit EOG via empty lines (explicit `[EOG]` still allowed)
   - Clearer escape sequence rules (minimal: just `\|` and `\\`)

5. **Simplified Escape Sequences**
   - v3.x: Required escaping `[`, `]`, `{`, `}`, space markers
   - v4.0: Only escape delimiter `\|` and backslash `\\`
   - Benefit: Less escaping needed, more readable

**Migration from v3.x to v4.0:**

**Step 1: Update Preamble**

v3.x format:
```
filename.set
UTF-8
:[]:{}:|:\:…:
NFC|en-US|LTR


VERSION: 3.3
```

v4.0 format:
```
filename.set

[THIS-FILE]
Version|4.0
Delimiters|:[]:{}:|:\:…:
Encode|UTF-8
Localize|NFC|en-US|LTR
[EOG]
```

**Step 2: Update Group Names**

v3.x format:
```
[=SETTINGS=]
Key|Value
[EOG]
```

v4.0 format:
```
[SETTINGS]
Key|Value
[EOG]
```

Simply remove the `=` signs from group names.

**Step 3: Replace Comment Blocks**

v3.x format:
```
{|[NOTE]|}
This is a comment
{|[/NOTE]|}

[DATA]
```

v4.0 format:
```
This is a comment

[DATA]
```

Or use unreferenced text blocks:
```
[{NOTE}]
This is a comment
[EOG]

[DATA]
```

**Step 4: Simplify Escape Sequences**

v3.x: Required escaping brackets and braces
```
Expression|\[value\] in \{range\}
```

v4.0: Only escape delimiter and backslash
```
Expression|[value] in {range}
```

Unless the line starts with `[`, brackets don't need escaping.
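
Step 4 can be approximated with a small helper. This is a sketch under stated assumptions: it only handles the bracket and brace escapes shown above (v3.x space markers are not covered), it keeps a leading `\[` so the line is not mistaken for a group header, and the name `simplify_escapes` is hypothetical.

```python
import re


def simplify_escapes(line):
    """Drop v3.x escapes for [, ], {, } while keeping \\| and \\\\ intact."""
    def replace(match):
        char = match.group(1)
        # Keep the escape if removing it would make the line start with "["
        if char in "[]{}" and not (match.start() == 0 and char == "["):
            return char
        return match.group(0)

    # Scanning escape pairs left to right keeps "\\\\[" (escaped backslash) untouched
    return re.sub(r"\\(.)", replace, line)


# Usage with the example from this step
converted = simplify_escapes(r"Expression|\[value\] in \{range\}")
```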

**Automated Migration Script:**

```python
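import re

def is_v3_preamble(lines):
    # Heuristic: a full 7-line v3.x preamble ends with a "VERSION: x.y" line
    return len(lines) == 7 and lines[6].strip().upper().startswith("VERSION:")

def migrate_v3_to_v4(v3_filename, v4_filename):
    # Read the v3.x file as a list of lines without trailing newlines
    # (UTF-8 is assumed here regardless of the encoding the preamble declares)
    with open(v3_filename, encoding="utf-8") as f:
        lines = f.read().splitlines()
    output = []

    # Convert a full 7-line preamble to a [THIS-FILE] group
    # (shorter progressive preambles need manual conversion)
    if is_v3_preamble(lines[0:7]):
        output.append(lines[0])  # filename
        output.append("")         # blank line
        output.append("[THIS-FILE]")
        output.append("Version|4.0")
        if lines[2].strip():
            output.append(f"Delimiters|{lines[2].strip()}")
        if lines[1].strip():
            output.append(f"Encode|{lines[1].strip()}")
        if lines[3].strip():
            output.append(f"Localize|{lines[3].strip()}")
        output.append("[EOG]")
        output.append("")
        lines = lines[7:]  # Skip preamble

    for line in lines:
        # Drop comment block markers such as {|[NOTE]|} and {|[/NOTE]|};
        # the enclosed text remains as a plain comment outside any group
        if re.fullmatch(r'\s*\{\|\[.*\]\|\}\s*', line):
            continue

        # Rewrite [=NAME=] group headers as [NAME]; data lines are left alone
        line = re.sub(r'^\[=(.+)=\]\s*$', r'[\1]', line)

        output.append(line)

    # Escape sequences (Step 4) are not rewritten by this script
    with open(v4_filename, "w", encoding="utf-8") as f:
        f.write("\n".join(output) + "\n")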
```

**Backward Compatibility:**

v4.0+ parsers **can** read most v3.x files with these caveats:
- Preamble must be converted to `[THIS-FILE]` group
- Comment blocks are not supported (but can be converted to text outside groups)
- `[=NAME=]` syntax works but is deprecated

v3.x parsers **cannot** reliably read v4.0+ files that use:
- `[THIS-FILE]` group instead of preamble
- Text outside groups as comments
- Single-line delimiter override
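
A tool that accepts both generations of files may want to detect v3.x input before parsing. The check below is a rough heuristic, not a rule from the specification: it assumes a v3.x file either carries a `VERSION:` line within its first seven lines or uses `[=NAME=]` headers or `{|[...]|}` comment markers somewhere. The name `looks_like_v3` is hypothetical.

```python
def looks_like_v3(lines):
    """Guess whether a list of lines comes from a v3.x SET file."""
    head = [line.strip() for line in lines[:7]]
    # A v3.x preamble ends with a line such as "VERSION: 3.3"
    if any(line.upper().startswith("VERSION:") for line in head):
        return True
    # v3.x-only syntax: key-value group headers and comment block markers
    return any(line.strip().startswith(("[=", "{|[")) for line in lines)


# Usage
v3_lines = ["filename.set", "UTF-8", r":[]:{}:|:\:…:", "NFC|en-US|LTR",
            "", "", "VERSION: 3.3", "[=SETTINGS=]", "Key|Value", "[EOG]"]
v4_lines = ["filename.set", "", "[THIS-FILE]", "Version|4.0", "[EOG]",
            "", "[SETTINGS]", "Key|Value", "[EOG]"]
needs_migration = looks_like_v3(v3_lines)
```

Files flagged this way can be routed through the migration script first, so the v4.x parser only ever sees converted input.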

---

### Version 3.3 (November 2025)

- Clarified progressive preamble definition
- Standardized group naming rules
- Made `[EOG]` and `[EOF]` markers optional
- Enhanced documentation

### Version 3.2 (November 2025)

- Added key-value groups `[=NAME=]`
- Added text block groups `[{NAME}]`
- Added text block reference system
- Enhanced validation rules

### Version 3.0 (September 2025)

- Added special functions (ellipses, single-use fields `:::`)
- Enhanced internationalization support
- Improved escape character handling
- Added SetQL query language

### Version 2.0

- Core format specification
- Escape sequences
- Comment blocks
- SetTag extensions

---

### Migration Best Practices

**When to migrate:**
- Creating new SET files → Use v4.2
- Simple v3.x files → Easy to migrate to v4.x
- Complex v3.x files with many comment blocks → Evaluate benefits
- Production systems → Test thoroughly before migration
- v4.0 to v4.2 → No migration needed, fully compatible

**Testing migration:**
1. Back up original files
2. Run migration script
3. Parse both versions with v4.2 parser
4. Compare data structures
5. Validate all references resolve
6. Test with your application
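
Steps 3 and 4 can be scripted once both files have been parsed. The sketch below assumes each parse result is a dictionary of groups keyed by group name; that shape and the name `diff_parsed` are illustrative assumptions, so adapt them to whatever structure your parser returns.

```python
def diff_parsed(before, after):
    """List group-level differences between two parsed SET structures."""
    problems = []
    for group in sorted(set(before) | set(after)):
        if group not in after:
            problems.append(f"missing group: {group}")
        elif group not in before:
            problems.append(f"unexpected group: {group}")
        elif before[group] != after[group]:
            problems.append(f"changed group: {group}")
    return problems


# Usage with two hand-built parse results
original = {"SETTINGS": {"Key": "Value"}, "USERS": [["1", "Alice"]]}
migrated = {"SETTINGS": {"Key": "Value"}, "USERS": [["1", "Alice"]],
            "THIS-FILE": {"Version": "4.0"}}
report = diff_parsed(original, migrated)
```

A `[THIS-FILE]` group that appears only in the migrated file is expected after a v3.x conversion; any other entry in the report deserves a closer look.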

**Gradual migration:**
- Migrate configuration files first (simplest)
- Then data files
- Finally, complex files with many text blocks
- Keep old version files until new versions are validated

---

## License

[![CC BY 4.0][cc-by-shield]][cc-by]

This guide is licensed under the **Creative Commons Attribution 4.0 International License (CC BY 4.0)**.

**Copyright (c) 2025 Kirk Siqveland**

You are free to:
- **Share** — copy and redistribute the material in any medium or format
- **Adapt** — remix, transform, and build upon the material for any purpose, even commercially

Under the following terms:
- **Attribution** — You must give appropriate credit, provide a link to the license, and indicate if changes were made

Full license text: https://creativecommons.org/licenses/by/4.0/

Implementations of this specification may use any license of the implementer's choosing.

[cc-by]: https://creativecommons.org/licenses/by/4.0/
[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg

---

*End of SET File Implementation Guide v4.2*

**Questions or feedback?**  
Visit: https://github.com/kirksiqveland/setfile

