Mirror of https://github.com/Tyrrrz/DiscordChatExporter.git (synced 2026-06-10 00:02:37 -06:00)
IMPLEMENTATION UNITS (U1-U6):
U1: Append-only merge test coverage
- Enhanced run-discord-scrape-smoke.sh with additional test scenarios
- Created append-partial-write.json and append-concurrent-conflict.json fixtures
- Added assertions for message sorting, deduplication, and idempotency
- All 10 merge scenarios validated
U2: Error handling validation
- Created error-path-smoke.sh with 6 error scenario tests
- Added test configs for invalid paths, missing files, bad JSON
- Verified fail-closed behavior on all error paths
- No silent data loss on any failure
U3: Cron idempotency and lifecycle
- Created cron-idempotency-smoke.sh with full lifecycle testing
- Created fixture crontab with unrelated entries (preservation test)
- Verified idempotent install, update, and remove operations
- Confirmed dry-run and entry preservation
U4: Preflight and end-to-end setup
- Created end-to-end-preflight-smoke.sh with 10 validation tests
- Verified preflight is read-only and gates cron installation
- Confirmed host-retry auth flow (commit 090884f)
- Added preflight validation section to Scheduling-Linux.md
U5: Documentation completion
- Updated Readme.md with recurring-scraper link
- Created Recurring-Scrape-Setup.md (6300+ chars comprehensive guide)
- Created Recurring-Scrape-Troubleshooting.md (9200+ chars with 30+ scenarios)
- Enhanced .docs/Scheduling-Linux.md with preflight section
- All documented behavior matches implementation
U6: Production-readiness checklist
- Created docs/recurring-scrape-production-checklist.md
- Compiled all validation results (33+ scenarios across U1-U5)
- Documented test execution commands for re-validation
- Provided deployment notes and monitoring guidance
- Clear sign-off criteria established
ARTIFACTS:
- 4 new smoke test scripts (1000+ lines total)
- 4 new fixtures and test configs
- 3 new documentation files (15500+ chars)
- 2 updated documentation files
- 1 validation checklist tracking document
- All tests passing
SAFETY GUARANTEES VERIFIED:
✅ No silent data loss on any error path
✅ Fail-closed behavior throughout
✅ Archive updates are append-only and idempotent
✅ Cron installation is idempotent
✅ Unrelated cron entries preserved
✅ Preflight is read-only
✅ Token validated before operations
✅ Path traversal prevented
STATUS: Production Ready
All 6 implementation units complete and validated.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot Instructions for DiscordChatExporter
Build, Test, and Lint Commands
Build
# Full build
dotnet build --configuration Release
# Quick build without formatting checks
dotnet build -p:CSharpier_Bypass=true
Test
# Run all tests
dotnet test --configuration Release
# Run a specific test class (xUnit filters on FullyQualifiedName, not ClassName)
dotnet test --configuration Release --filter "FullyQualifiedName~HtmlContentSpecs"
# Run tests with code coverage
dotnet test -p:CSharpier_Bypass=true --configuration Release --collect:"XPlat Code Coverage"
Format and Lint
# Format code with CSharpier (integrated into CI)
dotnet build -t:CSharpierFormat --configuration Release
# Release build that skips CSharpier entirely (CSharpier_Bypass disables the formatting check; it does not verify formatting)
dotnet build -p:CSharpier_Bypass=true --configuration Release
Note: CSharpier formatting is enforced in CI. Run dotnet build -t:CSharpierFormat before committing to avoid CI failures.
High-Level Architecture
DiscordChatExporter is a .NET 10.0 application with a layered architecture:
Layer 1: Core (DiscordChatExporter.Core)
- Discord - Discord API client and data models
  - DiscordClient - HTTP client for Discord API v10
  - Data models in Discord/Data/ (records like Channel, Message, Guild) with Parse() methods for JSON deserialization
  - Rate-limit handling with configurable preference
- Exporting - Multi-format export engines
  - ChannelExporter - Orchestrates the export process
  - Format writers: HtmlMessageWriter, JsonMessageWriter, CsvMessageWriter, PlainTextMessageWriter
  - Asset downloading and context building
- Markdown - Converts Discord markdown to target format (HTML or plaintext)
- Utils - Shared utilities for HTTP, validation, etc.
Layer 2: Interfaces
- Cli (DiscordChatExporter.Cli) - Command-line interface using CliFx
  - Commands in Commands/ subdirectory (follows command pattern)
- Gui (DiscordChatExporter.Gui) - Graphical interface using Avalonia
  - ViewModels with MVVM pattern
  - Services for state management
  - Localization support
Layer 3: Tests
- Cli.Tests (DiscordChatExporter.Cli.Tests) - Integration tests using xUnit
  - Specs/ - Scenario tests for export formats and features
  - Infra/ - Test infrastructure and helpers
  - Tests verify HTML/JSON/CSV/TXT exports against Discord test data
Data Flow
Discord API → DiscordClient (rate-limited)
→ ExportContext (loads channel/role/user data)
→ MessageExporter (fetches and writes messages)
→ Format-specific Writer (HTML/JSON/CSV/TXT)
→ File output
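The flow above can be pictured as a producer that yields messages one at a time and a writer that consumes them as they arrive. The sketch below is a simplified stand-in, not the repository's code: StreamingExportSketch, FetchMessagesAsync, and ExportAsync are hypothetical names, and plain strings take the place of real message records.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

// Hypothetical sketch; these types and methods do not exist in DiscordChatExporter.
public static class StreamingExportSketch
{
    // Stand-in for the API client: yields messages one at a time instead of returning a full list.
    public static async IAsyncEnumerable<string> FetchMessagesAsync(int count)
    {
        for (var i = 1; i <= count; i++)
        {
            await Task.Yield(); // stand-in for a paged network request
            yield return $"message {i}";
        }
    }

    // Stand-in for a format writer: writes each message as soon as it arrives,
    // so memory use does not grow with the size of the channel.
    public static async Task<int> ExportAsync(IAsyncEnumerable<string> messages, TextWriter output)
    {
        var written = 0;
        await foreach (var message in messages)
        {
            await output.WriteLineAsync(message);
            written++;
        }

        return written;
    }
}
```

Calling ExportAsync(FetchMessagesAsync(3), writer) with a StringWriter writes three lines and returns 3. Nothing in the pipeline holds more than one message at a time.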
Key Conventions
C# Language Features
- File-scoped namespaces - Use namespace X; (not braces)
- Primary constructors - public class MyClass(string param) for injecting dependencies
- Nullable reference types - Enabled globally; use ? for nullable types, ! only when safe
- Treat warnings as errors - All warnings must be resolved before commit
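As a rough illustration of the conventions above, the following sketch combines a file-scoped namespace, a primary constructor, and nullable annotations in one small type. ExampleApp and ChannelLabel are hypothetical names chosen for the example, not types from the repository.

```csharp
// Hypothetical example; none of these names exist in DiscordChatExporter.
#nullable enable

namespace ExampleApp; // file-scoped namespace: no braces, no extra indentation level

// Primary constructor: the parameters are declared on the type itself.
public class ChannelLabel(string name, string? topic)
{
    public string Name { get; } = name;

    // Nullable reference type: callers must handle a missing topic explicitly.
    public string? Topic { get; } = topic;

    // Pattern matching on null avoids reaching for the ! operator.
    public string Describe() => Topic is null ? Name : $"{Name} ({Topic})";
}
```

With this shape, new ChannelLabel("general", null).Describe() yields "general", while a label with a topic appends it in parentheses.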
Data Model Patterns
- Use record types for data classes (immutable by default)
- Implement IHasId interface for entities with ID fields
- Deserialization via public static T Parse(JsonElement json) method
- Partial records with separate Parse methods in distinct file sections
- Use Pipe() extension for method chaining transformations
// Example pattern:
public partial record Message(Snowflake Id, string Content) : IHasId { }

public partial record Message
{
    public static Message Parse(JsonElement json)
    {
        var id = json.GetProperty("id").GetNonWhiteSpaceString().Pipe(Snowflake.Parse);
        var content = json.GetProperty("content").GetNonWhiteSpaceString();
        return new(id, content);
    }
}
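The Pipe() call in the example above can be understood as a small generic extension method. The following is a minimal sketch, assuming it simply applies a function to its receiver; the class name PipeExtensions is hypothetical and the repository's actual definition may differ.

```csharp
using System;

// Hypothetical sketch of a Pipe() helper; the real extension may be defined differently.
public static class PipeExtensions
{
    // Passes the input through the transform, which lets a conversion
    // be appended to the end of a fluent call chain.
    public static TOut Pipe<TIn, TOut>(this TIn input, Func<TIn, TOut> transform) =>
        transform(input);
}
```

For instance, "42".Pipe(int.Parse) reads left to right in the same order the operations happen, whereas int.Parse("42") wraps the source expression.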
Exception Handling
- Custom exceptions inherit from DiscordChatExporterException
- Specific exception types for domain errors: ChannelEmptyException, InvalidStateException, etc.
- Exceptions include helpful context about the guild/channel where applicable
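A hierarchy like the one described might look roughly as follows. The type names come from the list above, but the constructor shapes, the ChannelName property, and the message text are assumptions made for illustration; the repository's actual definitions may differ.

```csharp
#nullable enable

using System;

// Assumed shape: a common base type so callers can catch all domain errors at once.
public class DiscordChatExporterException(string message, Exception? innerException = null)
    : Exception(message, innerException)
{
}

// Assumed shape: a specific error that carries the channel it refers to,
// so the message gives the user enough context to act on.
public class ChannelEmptyException(string channelName)
    : DiscordChatExporterException($"Channel '{channelName}' does not contain any messages.")
{
    public string ChannelName { get; } = channelName;
}
```

Catching DiscordChatExporterException then handles every domain error, while a catch on ChannelEmptyException can treat that one case specially.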
Discord API Integration
- All API URLs are relative to base URI https://discord.com/api/v10/
- Token authorization uses Authorization header (either Bot {token} or raw token)
- Rate limiting respects Discord advisory headers but can be configured to respect only hard limits
- Use Http.ResponseResiliencePipeline for retry logic (configured via Polly)
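The first two points can be sketched with the standard HttpRequestMessage API. DiscordRequestSketch, CreateRequest, and the isBot flag are hypothetical names used only here; how the real client decides between the two token forms is not shown in this document.

```csharp
using System;
using System.Net.Http;

// Hypothetical helper; not the repository's DiscordClient.
public static class DiscordRequestSketch
{
    // Base URI as stated in the conventions; the trailing slash matters for relative resolution.
    private static readonly Uri BaseUri = new("https://discord.com/api/v10/");

    // Builds a GET request for a path relative to the base URI.
    public static HttpRequestMessage CreateRequest(string relativeUrl, string token, bool isBot)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, new Uri(BaseUri, relativeUrl));

        // Bot tokens are prefixed with "Bot "; other tokens are sent as-is.
        // TryAddWithoutValidation stores the value verbatim instead of parsing it.
        request.Headers.TryAddWithoutValidation("Authorization", isBot ? $"Bot {token}" : token);

        return request;
    }
}
```

Note that relativeUrl must not start with a slash: new Uri(BaseUri, "/users/@me") would discard the api/v10/ prefix, whereas "users/@me" resolves beneath it.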
Export Format Implementation
- Each format has a dedicated *MessageWriter class
- Writers implement MessageWriter interface
- Template files (.cshtml) use RazorBlade for HTML/plaintext rendering
- Markdown conversion uses separate visitors: HtmlMarkdownVisitor, PlainTextMarkdownVisitor
Testing
- Tests in DiscordChatExporter.Cli.Tests/Specs/ follow naming pattern: [Format][Feature]Specs.cs
- Use xUnit [Fact] for individual tests
- Test infrastructure in Infra/ includes ExportWrapper for export orchestration
- Tests require Discord API access; sensitive tests need DISCORD_TOKEN secret
- Use FluentAssertions for readable assertions: .Should().Equal(...), .Should().Contain(...)
Dependencies and Injection
- Microsoft.Extensions.DependencyInjection for IoC
- Services typically injected via primary constructor
- Configuration loaded via Microsoft.Extensions.Configuration (supports env vars and user secrets)
Code Organization
- Folder structure mirrors namespace structure
- Data models organized under domain folder (e.g., Discord/Data/)
- Keep public methods at the top of the class
- Use async ValueTask for small async operations, async Task for larger ones
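The ValueTask/Task guideline is easiest to see on a method that often completes without awaiting anything. The sketch below uses a hypothetical NameCache type, with Task.Delay standing in for a network call; it is an illustration of the convention, not code from the repository.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical cache used only to illustrate the convention.
public class NameCache
{
    private readonly Dictionary<ulong, string> _names = new();

    // Small operation that often completes synchronously: on a cache hit,
    // ValueTask returns the result without allocating a Task object.
    public async ValueTask<string> GetNameAsync(ulong id)
    {
        if (_names.TryGetValue(id, out var cached))
            return cached;

        var name = await FetchNameAsync(id);
        _names[id] = name;
        return name;
    }

    // Larger operation that always does asynchronous work: a plain Task is fine here.
    private static async Task<string> FetchNameAsync(ulong id)
    {
        await Task.Delay(10); // stand-in for a network call
        return $"user-{id}";
    }
}
```

One caveat with this choice: a ValueTask should be awaited once and not stored or awaited repeatedly, which is part of why it suits small, immediately awaited operations.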
Architecture Details
Why This Structure?
- Separation of concerns: Core library independent from UI implementations
- Multi-UI support: CLI and GUI share identical core export logic
- Testability: Core is fully testable without UI dependencies
- Extensibility: New export formats are isolated to a single writer class
Important Flow Details
- Message export is stream-based to handle large channels efficiently
- Discord API client implements exponential backoff for rate limits
- Exports can be partitioned by size or date range to manage large channel history
- Assets (images, videos, etc.) can be selectively downloaded during export
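To make the exponential backoff point concrete, the delay schedule can be written as a small pure function. This is a standalone sketch of the general technique, not the Polly configuration the project uses; BackoffSketch, GetDelay, and all parameter names are hypothetical.

```csharp
using System;

// Hypothetical helper showing the shape of an exponential backoff schedule.
public static class BackoffSketch
{
    // Delay before retry number `attempt` (1-based):
    // baseDelay * 2^(attempt - 1), capped at maxDelay.
    public static TimeSpan GetDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        if (attempt < 1)
            throw new ArgumentOutOfRangeException(nameof(attempt));

        // Clamp the exponent so the factor stays within a sane range.
        var factor = Math.Pow(2, Math.Min(attempt - 1, 30));

        // Apply the cap in milliseconds before constructing the TimeSpan,
        // so an oversized product can never overflow it.
        var milliseconds = Math.Min(baseDelay.TotalMilliseconds * factor, maxDelay.TotalMilliseconds);

        return TimeSpan.FromMilliseconds(milliseconds);
    }
}
```

With a base delay of one second and a cap of thirty seconds, attempts 1, 2, and 3 wait 1, 2, and 4 seconds, and every attempt from the sixth onward waits the full thirty. Production retry policies usually add random jitter on top of this schedule so that concurrent clients do not retry in lockstep.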