[Bug]: ISO 8859/1 messages display corrupted characters (encoding mismatch) #78

### Affected Component

HL7 Parser (message parsing, delimiter detection)

### Bug Description

Messages encoded in ISO 8859-1 (Latin-1) are displayed with corrupted characters in HL7 Forge. Accented characters (`ä`, `ö`, `ü`, `ß`, `é`, etc.) appear as `?`, as the Unicode replacement character, or as garbled sequences (`â€`, `Ã¶`, etc.) instead of their correct glyphs.

**Example:** A patient name like `Thöni` is rendered as `ThÃ¶ni` or `Th?ni`.

The sending system correctly declares the encoding in MSH-18 (`8859/1`), but HL7 Forge ignores this field entirely.

---

## Root Cause

**File:** `src/mllp.rs`, line 232

```rust
let message = String::from_utf8_lossy(message_bytes).to_string();
```

`String::from_utf8_lossy` unconditionally assumes UTF-8. ISO 8859-1 encodes each extended Latin character as a single byte in the range `0x80–0xFF`. Such a byte on its own is not a valid UTF-8 sequence, so it is silently replaced with `U+FFFD` before the message ever reaches the parser or the UI.

MSH-18 is never read. The parser (`src/hl7/parser.rs`) operates on the already-corrupted string. The `Hl7Message` struct (`src/hl7/types.rs`) has no `charset` field. There is no code path anywhere that inspects or respects MSH-18.
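
To make the failure concrete, the following minimal sketch compares the current decoding with a plain Latin-1 decode of the same bytes. The function names `decode_lossy_utf8` and `decode_latin1` are hypothetical and not part of the code base; only `String::from_utf8_lossy` comes from the line quoted above.

```rust
/// Decode the way `src/mllp.rs` does today: assume UTF-8 and replace
/// every invalid sequence with U+FFFD.
fn decode_lossy_utf8(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Decode as ISO 8859-1: each byte maps to the Unicode code point
/// with the same numeric value, so the conversion cannot fail.
fn decode_latin1(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| char::from(b)).collect()
}
```

For the Latin-1 bytes of `Thöni` (`ö` is the single byte `0xF6`), the first function loses the character while the second preserves it.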

---

## Affected Code Paths

| File | Line | Issue |
|------|------|-------|
| `src/mllp.rs` | 232 | Hard-coded `from_utf8_lossy()`; always assumes UTF-8 |
| `src/hl7/parser.rs` | n/a | Receives already-corrupted `&str`; MSH-18 never extracted |
| `src/hl7/types.rs` | n/a | `Hl7Message` has no `charset`/`encoding` field |

---

## Proposed Fix

1. **Two-pass decode in `extract_mllp_frame`** (`src/mllp.rs`):
   - Do a minimal ASCII-only scan of the raw bytes to locate and read the MSH-18 field value (safe, since all HL7 delimiters and segment names are ASCII).
   - Select decoder based on MSH-18: `8859/1` → Latin-1, `UTF-8`/absent → current behavior.
   - Recommended crate: [`encoding_rs`](https://crates.io/crates/encoding_rs)

2. **Add `charset` field to `Hl7Message`** (`src/hl7/types.rs`):
   ```rust
   pub charset: Option<String>
   ```

3. **Surface charset in the UI** â show detected encoding in the message header row.
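
The two-pass decode from step 1 could look roughly like the sketch below. It is an illustration under assumptions, not the actual `extract_mllp_frame` code: the helpers `msh18` and `decode_message` are hypothetical, the input is assumed to be the message bytes with MLLP framing already stripped, and the Latin-1 branch uses only the standard library (an `encoding_rs` decoder could take its place). The returned `Option<String>` is the value that would populate the proposed `charset` field.

```rust
/// Hypothetical helper: find the raw MSH-18 value without decoding the message.
fn msh18(raw: &[u8]) -> Option<&[u8]> {
    // The header must start with "MSH" followed by the field separator (MSH-1).
    if raw.len() < 4 || !raw.starts_with(b"MSH") {
        return None;
    }
    let sep = raw[3];
    // Only scan the first segment; HL7 segments end with a carriage return.
    let end = raw.iter().position(|&b| b == b'\r').unwrap_or(raw.len());
    // Splitting on the separator yields "MSH" at index 0 and MSH-2 at index 1,
    // so MSH-n sits at index n - 1 and MSH-18 at index 17.
    raw[..end]
        .split(|&b| b == sep)
        .nth(17)
        .filter(|field| !field.is_empty())
}

/// Hypothetical helper: decode a message according to MSH-18.
/// Returns the decoded text and the declared charset, if any.
fn decode_message(raw: &[u8]) -> (String, Option<String>) {
    // MSH-18 values are ASCII, so a lossy decode of the field itself is safe.
    let charset = msh18(raw).map(|v| String::from_utf8_lossy(v).trim().to_string());
    let text: String = match charset.as_deref() {
        // Latin-1: every byte is the code point of the same value.
        Some("8859/1") => raw.iter().map(|&b| char::from(b)).collect(),
        // Anything else, or no declaration: keep the current behavior.
        _ => String::from_utf8_lossy(raw).into_owned(),
    };
    (text, charset)
}
```

Scanning raw bytes for the separator is safe here because Latin-1 is a single-byte encoding, so a delimiter byte can never appear inside another character. Multi-byte charsets other than UTF-8 would need more care.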

### Steps to Reproduce

1. Start hl7-forge with default settings.
2. Send an MLLP message with `MSH-18` set to `8859/1` containing Latin-1 extended characters (e.g. patient name `Müller` or `Thöni`).
3. Open the Web UI and click on the received message.
4. Observe corrupted characters in both the **Segments** tab and **Raw** tab.
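
For step 2, a test message can be produced with a few lines of standard-library Rust. This is a sketch with hypothetical helper names (`encode_latin1`, `mllp_frame`, `send`). The listener address is not stated in this report, so it is left as a parameter. The framing bytes are the usual MLLP ones: `0x0B` before the payload, `0x1C 0x0D` after it.

```rust
use std::io::Write;
use std::net::TcpStream;

/// Encode text as ISO 8859-1. Returns None if any character is above U+00FF.
fn encode_latin1(text: &str) -> Option<Vec<u8>> {
    text.chars()
        .map(|c| if (c as u32) <= 0xFF { Some(c as u8) } else { None })
        .collect()
}

/// Wrap a payload in an MLLP frame: <VT> payload <FS><CR>.
fn mllp_frame(payload: &[u8]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(payload.len() + 3);
    frame.push(0x0B);
    frame.extend_from_slice(payload);
    frame.extend_from_slice(&[0x1C, 0x0D]);
    frame
}

/// Send one framed message. `addr` is the host:port the listener is bound to
/// (an assumption of this sketch; the report does not name it).
fn send(addr: &str, payload: &[u8]) -> std::io::Result<()> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(&mllp_frame(payload))
}
```

Encoding the sample header followed by a segment containing `Thöni` yields bytes that are not valid UTF-8, which is exactly the input that triggers the corruption.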

### Expected Behavior

Characters like `ä`, `ö`, `ü`, `ß` display correctly based on the encoding declared in MSH-18.

### Actual Behavior

Extended Latin characters are corrupted: displayed as `â€`, `Ã¶`, `Ã¤`, or `?` depending on the byte sequence.

### HL7 Message Sample

```text
MSH|^~\&|XXX RIS|XXX|Test-ORU|XXX|20260309102241||ORU^R01|137975750|P|2.4|||AL|NE||8859/1
```

### Operating System

Windows

### HL7 Forge Version

v0.4.0

### Deployment Method

Pre-built binary (.exe / release)

### Additional Context

## HL7 Standard Reference

**MSH-18 (Character Set)**: HL7 v2.x Table 0211

Common values: `ASCII`, `8859/1`, `8859/2`, `UNICODE UTF-8`
Many real-world EMR/RIS systems send ISO 8859-1 data and declare it in MSH-18.
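
One way to turn these table values into something the decoder can branch on is a small enum. The sketch below is illustrative only: `Charset` and `parse_charset` are hypothetical names, only the values listed above are recognized, and matching case-insensitively is an assumption about what senders emit rather than something the standard requires.

```rust
/// Hypothetical classification of an MSH-18 value.
#[derive(Debug, PartialEq)]
enum Charset {
    Ascii,
    Latin1,
    Utf8,
    /// Declared but not handled by this sketch (for example `8859/2`).
    Other(String),
}

/// Map a decoded MSH-18 string to a `Charset`.
/// Returns None when the field is empty, i.e. no charset was declared.
fn parse_charset(msh18: &str) -> Option<Charset> {
    let value = msh18.trim();
    if value.is_empty() {
        return None;
    }
    Some(match value.to_ascii_uppercase().as_str() {
        "ASCII" => Charset::Ascii,
        "8859/1" => Charset::Latin1,
        "UNICODE UTF-8" => Charset::Utf8,
        // Keep the original spelling so the UI can show what was declared.
        _ => Charset::Other(value.to_string()),
    })
}
```

Keeping unrecognized values in `Other` rather than dropping them lets the UI still display the declared charset even when no matching decoder is available.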

