Strong typing#83

Open
hmgaudecker wants to merge 27 commits into main from strong-typing

@hmgaudecker hmgaudecker commented Jan 8, 2026

Refactor model configuration to use frozen dataclasses instead of dicts

Introduce strongly-typed dataclasses for model configuration:

  • A ModelSpec instance replaces the nested dicts as the main entry point (the YAML files are removed as well)
  • Dimensions, Labels, Anchoring, EstimationOptions, TransitionInfo
  • FactorEndogenousInfo, EndogenousFactorsInfo

🤖 Generated with Claude Code

Introduce strongly-typed dataclasses for model configuration:
- Dimensions, Labels, Anchoring, EstimationOptions, TransitionInfo
- FactorEndogenousInfo, EndogenousFactorsInfo

This improves type safety and enables IDE autocompletion while keeping
user-facing model_dict as a plain dictionary.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@hmgaudecker hmgaudecker changed the title from "Move to strong tzping" to "Strong typing" Jan 8, 2026

codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 91.40000% with 129 lines in your changes missing coverage. Please review.
✅ Project coverage is 90.32%. Comparing base (dc87cf5) to head (7cf471a).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/skillmodels/model_spec.py | 59.84% | 51 Missing ⚠️ |
| src/skillmodels/correlation_heatmap.py | 48.14% | 28 Missing ⚠️ |
| src/skillmodels/utilities.py | 90.69% | 8 Missing ⚠️ |
| src/skillmodels/simulate_data.py | 83.33% | 7 Missing ⚠️ |
| src/skillmodels/visualize_factor_distributions.py | 85.10% | 7 Missing ⚠️ |
| src/skillmodels/check_model.py | 87.50% | 6 Missing ⚠️ |
| src/skillmodels/types.py | 94.11% | 6 Missing ⚠️ |
| src/skillmodels/visualize_transition_equations.py | 90.00% | 5 Missing ⚠️ |
| src/skillmodels/process_data.py | 92.85% | 3 Missing ⚠️ |
| src/skillmodels/process_model.py | 96.42% | 3 Missing ⚠️ |
| ... and 3 more | | |

Additional details and impacted files
```diff
@@            Coverage Diff             @@
##             main      #83      +/-   ##
==========================================
- Coverage   91.02%   90.32%   -0.70%
==========================================
  Files          42       47       +5
  Lines        3587     3919     +332
==========================================
+ Hits         3265     3540     +275
- Misses        322      379      +57
```


hmgaudecker and others added 3 commits January 8, 2026 19:29
Replace dict fields with frozendict in frozen dataclasses to ensure
true immutability:
- Labels.aug_periods_to_periods
- Labels.aug_stages_to_stages
- Anchoring.outcomes
- TransitionInfo.param_names, individual_functions, function_names
- EndogenousFactorsInfo.aug_periods_to_aug_period_meas_types, factor_info

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
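
For readers following along, here is a minimal sketch of why a plain dict field undermines a frozen dataclass, which is the problem this commit describes. It is not the code from this PR: the class names are made up, and the standard library's `MappingProxyType` stands in for the third-party `frozendict` the commit uses, so the example runs without extra dependencies.

```python
from collections.abc import Mapping
from dataclasses import FrozenInstanceError, dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class LabelsWithDict:
    # Hypothetical config class that keeps a plain dict field.
    aug_periods_to_periods: dict[int, int]


@dataclass(frozen=True)
class LabelsWithReadOnlyMapping:
    # Same field, but wrapped in a read-only view on construction.
    aug_periods_to_periods: Mapping[int, int]

    def __post_init__(self) -> None:
        # frozen=True blocks normal assignment, so go through object.__setattr__.
        object.__setattr__(
            self,
            "aug_periods_to_periods",
            MappingProxyType(dict(self.aug_periods_to_periods)),
        )


loose = LabelsWithDict(aug_periods_to_periods={0: 0, 1: 0})
loose.aug_periods_to_periods[2] = 1  # succeeds: only the attribute is frozen

strict = LabelsWithReadOnlyMapping(aug_periods_to_periods={0: 0, 1: 0})
try:
    strict.aug_periods_to_periods[2] = 1
    mutated = True
except TypeError:  # the read-only view rejects item assignment
    mutated = False

try:
    strict.aug_periods_to_periods = {}
    rebound = True
except FrozenInstanceError:  # the frozen dataclass rejects rebinding
    rebound = False
```

One difference from the stand-in: a `frozendict` is also hashable when its values are, which a `MappingProxyType` generally is not.
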
Update process_model() to return a ProcessedModel frozen dataclass
and update all consumers to use attribute access instead of dict access.

This provides:
- Better type safety with explicit typed fields
- Immutability via frozen dataclass
- IDE autocomplete support
- Clear documentation of the model structure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
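
The switch from dict access to attribute access can be illustrated roughly as follows. `ProcessedModelSketch` and its two fields are hypothetical and far smaller than whatever `ProcessedModel` actually contains; the point is only the access pattern and how a frozen instance is "changed" by building a copy.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ProcessedModelSketch:
    """Hypothetical, heavily reduced stand-in for ProcessedModel."""

    n_periods: int
    latent_factors: tuple[str, ...]


as_dict = {"n_periods": 2, "latent_factors": ("fac1", "fac2")}
as_dataclass = ProcessedModelSketch(**as_dict)

# Dict access: a misspelled key is only caught at runtime (KeyError).
n_from_dict = as_dict["n_periods"]

# Attribute access: a misspelled name can be flagged by a type checker or IDE.
n_from_attr = as_dataclass.n_periods

# A frozen instance is never modified in place; replace() returns a new one.
longer = replace(as_dataclass, n_periods=3)
```
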
@hmgaudecker hmgaudecker marked this pull request as draft January 9, 2026 04:52

@hmgaudecker
Member Author

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.

@hmgaudecker hmgaudecker marked this pull request as ready for review January 30, 2026 12:50
@hmgaudecker hmgaudecker requested a review from janosg January 30, 2026 12:51
```python
from skillmodels import (
    AnchoringSpec,
    EstimationOptionsSpec,
```

Member

Better just EstimationSpec


## Implementation Status

None of these correction methods are currently implemented in skillmodels. Users
Member

This is probably outdated? You implemented Endogeneity correction

```python
    Normalizations,
)

MODEL2 = ModelSpec(
```

Member

I'm quite impressed how you (well, Claude) decided which things should be dataclasses and where we keep dictionaries.

Member Author

I think this will be a thing where I'll mostly take credit for the vibes at least. Needed several iterations.


## Dimensions

The `Dimensions` dataclass contains integer values for model dimensions:
Member

These sections can now be replaced with autodoc / autoclass; that's one of the benefits of strong typing! Results are best if attribute docstrings come right after the attribute, like here.

Member Author

This is a limitation of Jupyter Book for now; we can convert once that is available.
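
As an illustration of the "attribute docstring right after the attribute" layout mentioned in this thread, here is a small sketch. The class and field names are invented for the example and are not taken from the PR. The bare string literals are ignored at runtime; documentation tools such as Sphinx autodoc read them from the source.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DimensionsSketch:
    """Integer-valued model dimensions (hypothetical field names)."""

    n_latent_factors: int
    """Number of latent factors in the model."""

    n_periods: int
    """Number of periods covered by the data."""


dims = DimensionsSketch(n_latent_factors=3, n_periods=8)

# The docstrings do not turn into extra fields; only the annotated names do.
field_names = [f.name for f in fields(dims)]
```
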

```python
from jax import Array


def _make_immutable(value: object) -> object:
```
Member

I don't know where this is used but I tend to keep dicts and lists for data that should not be represented as a dataclass. It's more intuitive for most Python users and using tuples for lists of variables does not work well with pandas.
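
The body of `_make_immutable` is not visible in this hunk, so the following is only a guess at what a helper with that signature might do, written under a different name to make that clear. It recursively converts mappings to read-only views and lists to tuples, and then shows the conversion back to a list that the pandas concern implies.

```python
from collections.abc import Mapping
from types import MappingProxyType


def make_immutable_sketch(value: object) -> object:
    """Hypothetical helper: recursively wrap containers in read-only types."""
    if isinstance(value, Mapping):
        return MappingProxyType(
            {key: make_immutable_sketch(item) for key, item in value.items()}
        )
    if isinstance(value, (list, tuple)):
        return tuple(make_immutable_sketch(item) for item in value)
    return value


raw = {"controls": ["x1", "x2"], "anchoring": {"outcomes": {"fac1": "Q1"}}}
frozen = make_immutable_sketch(raw)

try:
    frozen["controls"] = ()
    write_blocked = False
except TypeError:  # read-only view rejects item assignment
    write_blocked = True

# pandas treats a tuple inside df[...] as one (possibly MultiIndex) key,
# so a tuple of column names has to be turned back into a list first.
controls_for_pandas = list(frozen["controls"])
```

The last line is the cost the comment above points at: every consumer that selects columns has to remember the `list(...)` call.
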



```python
@dataclass(frozen=True)
class EstimationOptions:
```
Member

Defaults should now be set in the dataclasses, which gives us a much nicer way to document them. Again, another main benefit of strongly typed code. Setting defaults when everything is a dict is very annoying, and they are now probably a bit scattered throughout the processing.
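
A rough sketch of what this suggestion could look like, with defaults and their documentation living in one place. The option names and default values below are placeholders chosen for the example; they are not the actual options of `EstimationOptions`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EstimationOptionsSketch:
    """Hypothetical option names and defaults, for illustration only."""

    n_draws: int = 100
    """Number of draws (placeholder option)."""

    clip_bounds: bool = True
    """Whether to clip values at the bounds (placeholder option)."""


# Users override only what they need; everything else keeps its default.
defaults = EstimationOptionsSketch()
custom = EstimationOptionsSketch(n_draws=500)
```

With dicts, the same effect usually needs a separate defaults dict and a merge step somewhere in the processing code, which is where the scattering comes from.
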
