Skip to content

fix: handle multi-type TheoryData<T1, T2> in xUnit migration#5001

Merged
thomhurst merged 3 commits intomainfrom
fix/theorydata-multitype-migration
Feb 25, 2026
Merged

fix: handle multi-type TheoryData<T1, T2> in xUnit migration#5001
thomhurst merged 3 commits intomainfrom
fix/theorydata-multitype-migration

Conversation

@thomhurst
Copy link
Owner

Summary

Fixes #5000

  • TheoryData<T1, T2, ...> with multiple type arguments was incorrectly converted to IEnumerable<T1, T2> (invalid — IEnumerable<T> only takes one type parameter) and T1[] (losing subsequent types)
  • Now correctly converts to IEnumerable<(T1, T2, ...)> using tuples, with proper (T1, T2)[] array creation and { val1, val2 } initializer expressions converted to (val1, val2) tuple expressions
  • Single-type TheoryData<T> continues to work as before (IEnumerable<T> + T[])

Example

// Before (xUnit):
public static readonly TheoryData<string, int> Items = new()
{
    { "a", 1 },
    { "b", 2 }
};

// After (TUnit) — now correct:
public static readonly IEnumerable<(string, int)> Items = new (string, int)[]
{
    ("a", 1),
    ("b", 2)
};

Test plan

  • Added TheoryData_MultiType_Is_Flagged test — verifies analyzer detects multi-type TheoryData
  • Added TheoryData_MultiType_Can_Be_Converted test — verifies code fix produces correct tuple-based output
  • All 73 existing xUnit migration tests continue to pass

TheoryData<T1, T2> was incorrectly converted to IEnumerable<T1, T2> (invalid)
with a T1[] array (losing other types). Now correctly converts to
IEnumerable<(T1, T2)> with tuple array and tuple initializer expressions.
Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Summary

This is a well-targeted bug fix that correctly addresses the invalid code generation for multi-type . The approach is sound and the implementation follows existing patterns in the codebase.

What Works Well

  • Correct fix for the root cause: The previous code did TypeArgumentList.Arguments.FirstOrDefault() and joined all args into IEnumerable<T1, T2> (invalid), while this correctly folds them into IEnumerable<(T1, T2)> using proper tuple syntax.
  • Roslyn-first approach for type declaration: In TransformTheoryData, the type declaration transformation uses SyntaxFactory.TupleType constructed from the actual syntax nodes rather than parsing a string — this is the right way to build Roslyn syntax.
  • ConvertToTupleExpression is clean: The helper correctly guards on SyntaxKind.ComplexElementInitializerExpression (the { val1, val2 } syntax kind) and passes non-matching expressions through unchanged.
  • Tests follow established conventions: The _Is_Flagged test using using System; (no xUnit using) is consistent with the existing TheoryData_Is_Flagged test.

Concerns

1. Inconsistent strategy for vs. type declaration

In XUnitTwoPhaseAnalyzer.cs, the elementType for multi-type is computed as a plain string:

var elementType = isMultiType
    ? $"({string.Join(", ", typeArgs.Select(t => t.ToString()))})"
    : typeArgs[0].ToString();

This string is then consumed in MigrationTransformer.cs via SyntaxFactory.ParseTypeName(conversion.ElementType) for the array type ((string, int)[]). Meanwhile, the IEnumerable<(string, int)> type declaration rebuilds the tuple from scratch using SyntaxFactory.TupleType from the original generic syntax.

This dual strategy works, but it means that if the tuple element types carry trivia or annotations (e.g., nullable annotations, fully-qualified names with spaces), the ParseTypeName path strips that context. Consider either:

  • Using ParseTypeName consistently for both paths, or
  • Stashing the actual TypeSyntax list in TheoryDataConversion and using it for the array creation too (avoiding the string detour entirely)

2. Missing test coverage for 3+ type arguments

xUnit's TheoryData supports up to 10 generic type parameters. The implementation handles N types generically (via .Select()), but the tests only cover the 2-type case. A test with 3 types would provide confidence that the separator handling and tuple expression construction both work correctly at higher arities:

// e.g., TheoryData<string, int, bool> → IEnumerable<(string, int, bool)>
// { "a", 1, true } → ("a", 1, true)

3. is derivable from

IsMultiType is always ElementType.StartsWith("(") (or typeArgs.Count > 1). It's not a design flaw to store it explicitly (avoids fragile string inspection), but it creates a redundant invariant to maintain. Consider renaming the comment on ElementType to make the coupling explicit, or alternatively store the raw IReadOnlyList<string> ElementTypes and compute IsMultiType as a derived property — this would also eliminate the string-join approach for array type construction.

Verdict

The fix is correct, minimal, and well-tested for the common case. The issues above are non-blocking improvement opportunities rather than bugs. The 3+ type argument test gap is the most actionable suggestion if you want to harden the migration tool.

Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Summary

This is a well-targeted bug fix that correctly addresses the invalid code generation for multi-type TheoryData<T1, T2, ...>. The approach is sound and the implementation follows existing patterns in the codebase.

What Works Well

  • Correct fix for the root cause: The previous code did TypeArgumentList.Arguments.FirstOrDefault() and joined all args into IEnumerable<T1, T2> (invalid), while this correctly folds them into IEnumerable<(T1, T2)> using proper tuple syntax.
  • Roslyn-first approach for type declaration: In TransformTheoryData, the type declaration transformation uses SyntaxFactory.TupleType constructed from the actual syntax nodes rather than parsing a string — this is the right way to build Roslyn syntax.
  • ConvertToTupleExpression is clean: The helper correctly guards on SyntaxKind.ComplexElementInitializerExpression (the { val1, val2 } syntax kind) and passes non-matching expressions through unchanged.
  • Tests follow established conventions: The _Is_Flagged test using using System; (no xUnit using) is consistent with the existing TheoryData_Is_Flagged test.

Concerns

1. Inconsistent strategy for ElementType vs. type declaration

In XUnitTwoPhaseAnalyzer.cs, the elementType for multi-type is computed as a plain string — then consumed via SyntaxFactory.ParseTypeName(conversion.ElementType) for the array type ((string, int)[]). Meanwhile, the IEnumerable<(string, int)> type declaration rebuilds the tuple from scratch using SyntaxFactory.TupleType from the original generic syntax nodes.

This dual strategy works, but the ParseTypeName path strips trivia and context that the Roslyn-first path preserves. Consider either using ParseTypeName consistently for both paths, or storing the actual TypeSyntax nodes in TheoryDataConversion so the array creation can use them too (avoiding the string roundtrip entirely).

2. Missing test coverage for 3+ type arguments

xUnit's TheoryData supports up to 10 generic type parameters. The implementation handles N types generically (via .Select()), but the tests only cover the 2-type case. A test with 3 types would provide confidence that the separator handling and tuple expression construction both work correctly at higher arities:

// e.g., TheoryData<string, int, bool> -> IEnumerable<(string, int, bool)>
// { "a", 1, true } -> ("a", 1, true)

3. IsMultiType creates a redundant invariant

IsMultiType is always equivalent to typeArgs.Count > 1. Storing it explicitly avoids fragile string inspection, but it creates a redundant invariant to keep in sync. A cleaner alternative: store IReadOnlyList<string> ElementTypes and compute IsMultiType as a derived property — this would eliminate the string-join approach for ElementType construction and make the data model more honest.

Verdict

The fix is correct, minimal, and well-tested for the common case. The issues above are non-blocking improvement opportunities. The 3+ type argument test is the most actionable suggestion to further harden the migration tool.

Addresses code review feedback to verify separator handling and tuple
construction work correctly at higher arities (TheoryData<string, int, bool>).
Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Summary

This is a well-targeted, correct bug fix. The previous implementation took only , producing the invalid type syntax. The fix correctly folds multiple type arguments into a C# value tuple: . The implementation is clean and follows existing patterns in the codebase.


What Works Well

Correct fix for the root cause. The original code silently discarded all type arguments beyond the first, generating code that would not even compile for consumers. This PR makes it emit valid, idiomatic C#.

Roslyn-first approach for the type declaration. In TransformTheoryData, the IEnumerable<(T1, T2)> type is built properly using SyntaxFactory.TupleType + TupleElement from the actual TypeArgumentList syntax nodes, rather than round-tripping through a string. This preserves trivia and annotations on the type arguments.

ConvertToTupleExpression is safe. The guard on SyntaxKind.ComplexElementInitializerExpression is exactly right: { val1, val2 } inside a collection initializer is always a ComplexElementInitializerExpression, and single-value entries pass through unchanged.

Separator reuse is correct. In the outer list transformation, expressions.GetSeparators() is passed to the new SeparatedList — this correctly reuses the existing commas between items since the item count is unchanged.

Test coverage is solid. Three new tests are added: a flag-only test (property getter form), a 2-type code-fix test, and a 3-type code-fix test. The 3-type test is particularly valuable because it exercises the .Select()-based N-arity path for both the tuple type construction and the ConvertToTupleExpression helper.


Minor Concerns

1. Inconsistent strategy: string round-trip for array type vs. Roslyn nodes for enumerable type

In XUnitTwoPhaseAnalyzer.cs, ElementType for the multi-type case is computed as a plain string:

var elementType = isMultiType
    ? $"({string.Join(", ", typeArgs.Select(t => t.ToString()))})"\n    : typeArgs[0].ToString();

This string is then consumed via SyntaxFactory.ParseTypeName(conversion.ElementType) to build the array element type ((string, int)[]). Meanwhile, the IEnumerable<(T1, T2)> type is built using proper Roslyn syntax nodes.

The string path works in practice, but ParseTypeName on a string like "(string, int)" will normalize formatting and discard any trivia or annotations the original type arguments carried (e.g., nullable reference type annotations: string?, fully-qualified names with internal whitespace). In contrast, the Roslyn-first path for the enumerable type preserves those. This inconsistency is low-risk for the common cases tested, but worth noting for correctness under edge cases like TheoryData<string?, int>.

A cleaner approach would be to either:

  • Store the actual TypeArgumentListSyntax.Arguments (or a list of TypeSyntax) in TheoryDataConversion and use them for array type construction too, or
  • Apply ParseTypeName consistently to both paths and accept the normalization.

2. IsMultiType creates a redundant invariant

IsMultiType is always equivalent to typeArgs.Count > 1. Storing it explicitly alongside ElementType (which already encodes the multi-type distinction via the (T1, T2) format) means there are two properties that must stay in sync. A minor breakage in either computation would produce confusing silent mismatches.

A more self-consistent model: store IReadOnlyList<string> ElementTypes and compute IsMultiType as ElementTypes.Count > 1. This eliminates the string-join for ElementType and makes the data model more honest about what it contains.

This is a non-blocking observation; the current approach is safe because both IsMultiType and ElementType are set atomically in the same branch.

3. Code-fix test for property-getter form is missing

TheoryData_MultiType_Is_Flagged uses a property with a getter body (not a field initializer). The analyzer does detect the issue correctly. However, the transformation logic in AnalyzeTheoryData only looks for VariableDeclarationSyntax as a parent when finding the object creation to annotate — so the code fix silently does nothing for that property case (no annotation is applied to the creation inside the getter body, only to the type in the property declaration).

This is pre-existing behavior (not introduced by this PR), and the original TheoryData_Is_Flagged test has the same gap. But since this PR adds a new _Is_Flagged test specifically for the multi-type case using the property-getter form, it might be worth noting in comments that the code fix for the property-getter form does not convert the object creation, only the return type in the declaration.


Verdict

The fix is correct and the new tests (including the 3-type case) provide good confidence in the N-arity path. The concerns above are non-blocking — the most actionable follow-up would be addressing the nullable annotation edge case in the ParseTypeName path if users report issues with TheoryData<string?, int> conversions.

Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

Summary

This is a well-targeted, correct bug fix. The previous implementation took only TypeArgumentList.Arguments.FirstOrDefault(), producing the invalid IEnumerable<T1, T2> type syntax. The fix correctly folds multiple type arguments into a C# value tuple: IEnumerable<(T1, T2)>. The implementation is clean and follows existing patterns in the codebase.


What Works Well

Correct fix for the root cause. The original code silently discarded all type arguments beyond the first, generating code that would not even compile for consumers. This PR makes it emit valid, idiomatic C#.

Roslyn-first approach for the type declaration. In TransformTheoryData, the IEnumerable<(T1, T2)> type is built properly using SyntaxFactory.TupleType + TupleElement from the actual TypeArgumentList syntax nodes, rather than round-tripping through a string. This preserves trivia and annotations on the type arguments.

ConvertToTupleExpression is safe. The guard on SyntaxKind.ComplexElementInitializerExpression is exactly right: { val1, val2 } inside a collection initializer is always a ComplexElementInitializerExpression, and single-value entries pass through unchanged.

Separator reuse is correct. In the outer list transformation, expressions.GetSeparators() is passed to the new SeparatedList - this correctly reuses the existing commas between items since the item count is unchanged.

Test coverage is solid. Three new tests are added: a flag-only test (property getter form), a 2-type code-fix test, and a 3-type code-fix test. The 3-type test is particularly valuable because it exercises the .Select()-based N-arity path for both the tuple type construction and the ConvertToTupleExpression helper.


Minor Concerns

1. Inconsistent strategy: string round-trip for array type vs. Roslyn nodes for enumerable type

In XUnitTwoPhaseAnalyzer.cs, ElementType for the multi-type case is computed as a plain string:

var elementType = isMultiType
    ? $"({string.Join(", ", typeArgs.Select(t => t.ToString()))})"
    : typeArgs[0].ToString();

This string is then consumed via SyntaxFactory.ParseTypeName(conversion.ElementType) to build the array element type ((string, int)[]). Meanwhile, the IEnumerable<(T1, T2)> type is built using proper Roslyn syntax nodes.

The string path works in practice, but ParseTypeName on a string like "(string, int)" will normalize formatting and discard any trivia or annotations the original type arguments carried (e.g., nullable reference type annotations: string?, fully-qualified names with internal whitespace). In contrast, the Roslyn-first path for the enumerable type preserves those. This inconsistency is low-risk for the common cases tested, but worth noting for correctness under edge cases like TheoryData<string?, int>.

A cleaner approach would be to either store the actual TypeArgumentListSyntax.Arguments (or a list of TypeSyntax) in TheoryDataConversion and use them for array type construction too, or apply ParseTypeName consistently to both paths and accept the normalization.

2. IsMultiType creates a redundant invariant

IsMultiType is always equivalent to typeArgs.Count > 1. Storing it explicitly alongside ElementType (which already encodes the multi-type distinction via the (T1, T2) format) means there are two properties that must stay in sync. A minor breakage in either computation would produce confusing silent mismatches.

A more self-consistent model: store IReadOnlyList<string> ElementTypes and compute IsMultiType as ElementTypes.Count > 1. This eliminates the string-join for ElementType and makes the data model more honest about what it contains.

This is a non-blocking observation; the current approach is safe because both IsMultiType and ElementType are set atomically in the same branch.

3. Code-fix test for property-getter form is missing (pre-existing gap)

TheoryData_MultiType_Is_Flagged uses a property with a getter body (not a field initializer). The analyzer detects the issue correctly. However, the transformation logic in AnalyzeTheoryData only traverses up to VariableDeclarationSyntax when finding the object creation to annotate - so the code fix silently skips the object creation inside the getter body (only the return type in the property declaration gets annotated).

This is pre-existing behavior (not introduced by this PR), and the original TheoryData_Is_Flagged test has the same gap. But since this PR adds a new _Is_Flagged test specifically for the multi-type case using the property-getter form, it might be worth noting in a comment that the code fix does not convert the object creation inside the getter body.


Verdict

The fix is correct and the new tests (including the 3-type case) provide good confidence in the N-arity path. The concerns above are non-blocking improvement opportunities. The most actionable follow-up would be addressing the nullable annotation edge case in the ParseTypeName path if users report issues with TheoryData<string?, int> conversions.

…ntation

Address code review feedback:
- Replace ElementType string + IsMultiType flag with ElementTypes list;
  IsMultiType is now a derived property, eliminating redundant invariant
- Add shared BuildElementType() helper using Roslyn TupleType consistently
  for both IEnumerable type and array type construction (no more string
  roundtrip via ParseTypeName for tuple types)
- Fix pre-existing brace indentation bug: preserve original whitespace
  when the newline is trailing trivia of the preceding token
- Add property-getter code fix test confirming multi-type conversion
  works for the exact pattern from the issue report
Copy link
Contributor

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — Follow-Up

Previous Concerns: Status

All three concerns raised in the earlier reviews have been addressed in this update:

Concern #2 (IsMultiType redundant invariant) — RESOLVED
IsMultiType is now a pure derived property (ElementTypes.Count > 1). The data model is now self-consistent: ElementTypes is the single source of truth, and IsMultiType is a cheap derivation from it. No redundant invariant to maintain.

Missing 3-type test — RESOLVED
TheoryData_ThreeTypes_Can_Be_Converted now exercises the N-arity path for both BuildElementType (tuple type construction) and ConvertToTupleExpression (separator handling). Good.

Property-getter code-fix test — RESOLVED
TheoryData_MultiType_PropertyGetter_Can_Be_Converted confirms that the code fix does work end-to-end for the property-getter form (converting both the return type declaration and the object creation inside the getter body). The previous review's concern about this was incorrect about the limitation.


Concern #1 (Inconsistent strategy) — Addressed via Unification

The previous review flagged that the array-creation path round-tripped through strings (ParseTypeName(elementType)) while the enumerable-type path used proper Roslyn nodes from genericName.TypeArgumentList.Arguments. The new version unifies both paths through BuildElementType(conversion.ElementTypes), which calls:

SyntaxFactory.TupleType(
    SyntaxFactory.SeparatedList(
        elementTypes.Select(t =>
            SyntaxFactory.TupleElement(SyntaxFactory.ParseTypeName(t)))));

Both paths now round-trip through the individual type strings (from typeArg.ToString()). This is consistent, and for common cases it's correct. One remaining edge case: generic type arguments like Dictionary<string, int> as a TheoryData type parameter will have their internal whitespace normalized by ParseTypeName, but this is acceptable for a migration tool and the test coverage adequately exercises the realistic cases.

One micro-observation: BuildElementType(conversion.ElementTypes) is called twice in TransformTheoryData — once for the array type and once for the type declaration — producing two identical TypeSyntax trees. Since TypeSyntax is an immutable Roslyn node, there's no functional issue, but calling it once and reusing the result would be slightly cleaner. Non-blocking.


Verdict

This is a correct fix with solid test coverage and clean data model design. All three previously raised concerns are resolved. The implementation is ready to merge.

This was referenced Mar 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: xUnit TheoryData<T, U> doesn't translate into compilable code

1 participant