PHP Classes

File: docs/async-pagination.md

Recommend this page to a friend!
  Packages of Saurabh Gayali   Uniprot PHP Class   docs/async-pagination.md   Download  
File: docs/async-pagination.md
Role: Auxiliary data
Content type: text/markdown
Description: Auxiliary data
Class: Uniprot PHP Class
Access a UniProt Protein sequence REST API.
Author: By
Last change:
Date: 2 months ago
Size: 4,638 bytes
 

Contents

Class file image Download

Pagination Implementation Guide

Architecture

The UniProt PHP Class implements cursor-based pagination using the UniProt REST API's Link header for navigation.

Key Concepts

User's Page Size (NOT 500) - User selects page size: 10, 20, or 50 results per page - Each API call fetches exactly the number of results user requested - We never query 500 results automatically

Internal Optimization (Hidden from User) - Internally, we use batch size of 500 for efficient cursor navigation - This reduces the number of API calls needed to reach a specific offset - User still sees their requested page size in results

Cursor-Based Pagination - API provides Link header with rel="next" URL - We follow these cursors to navigate through batches - Each cursor points to the next batch of 500 results

Algorithm

When requesting offset=1390, pageSize=10:

1. Calculate which 500-result batch contains offset 1390
   completeBatches = floor(1390 / 500) = 2
   offsetInBatch = 1390 % 500 = 390

2. Make first API call: size=500, cursor position 0
   
3. Skip 2 complete batches by following cursors from Link headers
   - Call 1: Gets batch 1 (results 0-499)
   - Extract cursor for next batch from Link header
   - Wait 0.5 seconds (rate limiting)
   - Call 2: Gets batch 2 (results 500-999)
   - Extract cursor for next batch from Link header
   - Wait 0.5 seconds

4. Fetch batch 3 (results 1000-1499)
   - Contains offset 1390
   - Extract starting at position 390

5. Return results 390-399 (10 results from pageSize)

Implementation Details

getPaginatedResults($query, $offset, $pageSize)

Parameters: - $query (string): UniProt search query - $offset (int): Starting position (0-based) - $pageSize (int): Results per page (10, 20, or 50)

Returns:

[
    'results' => [...],          // Array of entry objects
    'total' => 20420,            // Total results available
    'offset' => 1390,            // Requested offset
    'pageSize' => 10,            // Results returned
    'currentPage' => 140,        // Current page number
    'totalPages' => 2042,        // Total pages
    'previousOffset' => 1380,    // Previous page offset (or null)
    'nextOffset' => 1400,        // Next page offset (or null)
    'pageLinks' => [1400, 1410], // Array of suggested next offsets
]

Async Search Implementation

Server-Side (search_test.php)

Receives URL parameters: - query - Search query string - offset - Starting position - size - Results per page (10, 20, or 50)

Example: search_test.php?query=organism_id:9606&offset=0&size=10

Returns JSON response with results and pageLinks.

Client-Side (search_async.html)

HTML/JavaScript interface that: 1. Calls search_test.php asynchronously 2. Displays results 3. Shows pagination links 4. Allows navigation to next/previous pages

Rate Limiting

  • 0.5 second delay between consecutive API calls
  • Prevents API rate limiting on large result sets
  • Automatic with `usleep(500000)` in getPaginatedResults

Error Handling

  • 30 second timeout per API request (increased from 20s default)
  • 10 second connection timeout
  • Empty results returned on errors instead of throwing exceptions
  • Graceful fallback if Link header is missing

Usage Examples

PHP Backend

$search = new UniProtSearch();
$result = $search->getPaginatedResults(
    'organism_id:9606 AND reviewed:true',
    1390,  // Page 140 with pageSize=10
    10
);

foreach ($result['results'] as $entry) {
    echo $entry['primaryAccession'] . "\n";
}

// Navigate to next page
if ($result['nextOffset'] !== null) {
    $nextPage = $search->getPaginatedResults(
        'organism_id:9606 AND reviewed:true',
        $result['nextOffset'],
        10
    );
}

JavaScript/Async

// Fetch page asynchronously
const response = await fetch(
    'search_test.php?query=organism_id:9606&offset=0&size=10'
);
const data = await response.json();

// Use pageLinks to build navigation
data.pageLinks.forEach(offset => {
    const pageNum = Math.floor(offset / 10) + 1;
    console.log(`Page ${pageNum} available at offset ${offset}`);
});

Performance Notes

  • Page 1 (offset 0): ~1-2 seconds
  • Page 50 (offset 490): ~2-3 seconds (1 batch skip + 0.5s wait)
  • Page 140 (offset 1390): ~5-8 seconds (2 batch skips + 1s wait)
  • Page 204 (offset 2030): ~10+ seconds (4 batch skips + 2s wait)

The time increases as we skip more batches, which is expected due to: 1. More API calls to follow cursors 2. Rate limiting delays between calls 3. Larger result sets (500 results per batch) take longer to process