add tests for multi-platform end-of-line behavior #1416

betanalpha · 2015-04-01T21:58:03Z

I'm using a fresh copy of dev (both Stan and CmdStan). The new parser error line numbers look great, but they're consistently off the model I'm trying to build. For example,

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

variable "logmu_ooc_C" does not exist.

ERROR at line 124

122:      beta_theta ~ normal(0,2);
123:      
124:      alpha_mu ~ normal(0,2);
                                                           ^
125:      alpha_sigma ~ normal(0,2);

while the model itself is given by

121: beta_theta ~ normal(0,2);
122: 
123: alpha_mu ~ normal(0,2);
124: alpha_sigma ~ normal(0,2);
125: alpha_theta ~ normal(0,2);
126:
127: // Measurements
128: for(n in 1:N_ooc) {
129:   ooc_count_C[n] ~ neg_binomial_2(exp(logmu_ooc_C), exp(log_sigma_ooc_C));
130:   ooc_count_T[n] ~ neg_binomial_2(exp(log_mu_ooc_T), exp(log_sigma_ooc_T));
131: }

The parser should be identifying line 129.

Has anyone seen inconsistencies like this? If someone can confirm then I'll put together a simple reproducible example.

The text was updated successfully, but these errors were encountered:

betanalpha · 2015-04-01T22:01:13Z

Possibly comments aren't getting counted in terms of line numbers?

bob-carpenter · 2015-04-01T23:53:08Z

I haven't and just tried a few examples and it seems fine
for line numbers.

Can you give me the whole program and I can try it here?

Whatever's causing this issue would also cause runtime
numbers to be off, since they're using the same counting.

Are you running this Windows or Linux or Mac? They all have
different line ends, and that may also be messing things
up somehow.

Bob

On Apr 2, 2015, at 8:58 AM, Michael Betancourt [email protected] wrote:

I'm using a fresh copy of dev (both Stan and CmdStan). The new parser error line numbers look great, but they're consistently off the model I'm trying to build. For example,

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

variable "logmu_ooc_C" does not exist.

ERROR at line 124

122: beta_theta ~ normal(0,2);
123:
124: alpha_mu ~ normal(0,2);
^
125: alpha_sigma ~ normal(0,2);

while the model itself is given by

121: beta_theta ~ normal(0,2);
122:
123: alpha_mu ~ normal(0,2);
124: alpha_sigma ~ normal(0,2);
125: alpha_theta ~ normal(0,2);
126:
127: // Measurements
128: for(n in 1:N_ooc) {
129: ooc_count_C[n] ~ neg_binomial_2(exp(logmu_ooc_C), exp(log_sigma_ooc_C));
130: ooc_count_T[n] ~ neg_binomial_2(exp(log_mu_ooc_T), exp(log_sigma_ooc_T));
131: }

The parser should be identifying line 129.

Has anyone seen inconsistencies like this? If someone can confirm then I'll put together a simple reproducible example.

—
Reply to this email directly or view it on GitHub.

betanalpha · 2015-04-02T00:03:44Z

I’m on a Mac — runtime errors are definitely off, too.
The model/data aren’t public so I’ll try to whip up a
pathological model tomorrow.

On Apr 2, 2015, at 12:53 AM, Bob Carpenter [email protected] wrote:

I haven't and just tried a few examples and it seems fine
for line numbers.

Can you give me the whole program and I can try it here?

Whatever's causing this issue would also cause runtime
numbers to be off, since they're using the same counting.

Are you running this Windows or Linux or Mac? They all have
different line ends, and that may also be messing things
up somehow.

Bob

On Apr 2, 2015, at 8:58 AM, Michael Betancourt [email protected] wrote:

I'm using a fresh copy of dev (both Stan and CmdStan). The new parser error line numbers look great, but they're consistently off the model I'm trying to build. For example,

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

variable "logmu_ooc_C" does not exist.

ERROR at line 124

122: beta_theta ~ normal(0,2);
123:
124: alpha_mu ~ normal(0,2);
^
125: alpha_sigma ~ normal(0,2);

while the model itself is given by

121: beta_theta ~ normal(0,2);
122:
123: alpha_mu ~ normal(0,2);
124: alpha_sigma ~ normal(0,2);
125: alpha_theta ~ normal(0,2);
126:
127: // Measurements
128: for(n in 1:N_ooc) {
129: ooc_count_C[n] ~ neg_binomial_2(exp(logmu_ooc_C), exp(log_sigma_ooc_C));
130: ooc_count_T[n] ~ neg_binomial_2(exp(log_mu_ooc_T), exp(log_sigma_ooc_T));
131: }

The parser should be identifying line 129.

Has anyone seen inconsistencies like this? If someone can confirm then I'll put together a simple reproducible example.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

betanalpha · 2015-04-02T09:56:31Z

Turns out it’s Windows newlines — the model was originally
written on a Windows box. Is there any way for the parser
to aggregate newline types and warn the user of potential
biases in the reported line numbers?

On Apr 2, 2015, at 1:03 AM, Michael Betancourt [email protected] wrote:

I’m on a Mac — runtime errors are definitely off, too.
The model/data aren’t public so I’ll try to whip up a
pathological model tomorrow.

On Apr 2, 2015, at 12:53 AM, Bob Carpenter [email protected] wrote:

I haven't and just tried a few examples and it seems fine
for line numbers.

Can you give me the whole program and I can try it here?

Whatever's causing this issue would also cause runtime
numbers to be off, since they're using the same counting.

Are you running this Windows or Linux or Mac? They all have
different line ends, and that may also be messing things
up somehow.

Bob

On Apr 2, 2015, at 8:58 AM, Michael Betancourt [email protected] wrote:

I'm using a fresh copy of dev (both Stan and CmdStan). The new parser error line numbers look great, but they're consistently off the model I'm trying to build. For example,

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

variable "logmu_ooc_C" does not exist.

ERROR at line 124

122: beta_theta ~ normal(0,2);
123:
124: alpha_mu ~ normal(0,2);
^
125: alpha_sigma ~ normal(0,2);

while the model itself is given by

121: beta_theta ~ normal(0,2);
122:
123: alpha_mu ~ normal(0,2);
124: alpha_sigma ~ normal(0,2);
125: alpha_theta ~ normal(0,2);
126:
127: // Measurements
128: for(n in 1:N_ooc) {
129: ooc_count_C[n] ~ neg_binomial_2(exp(logmu_ooc_C), exp(log_sigma_ooc_C));
130: ooc_count_T[n] ~ neg_binomial_2(exp(log_mu_ooc_T), exp(log_sigma_ooc_T));
131: }

The parser should be identifying line 129.

Has anyone seen inconsistencies like this? If someone can confirm then I'll put together a simple reproducible example.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

bob-carpenter · 2015-04-02T12:35:11Z

What do you want the behavior to be?

The line counting is happening below the level of the parser
in the iterator that Spirit uses. And it'd be hard to change
there. But the good news is we have the whole string representing
the program before it gets used to create the iterators.

Here's the iterator getting created in lang/parser.hpp:

  std::ostringstream buf;
  buf << input.rdbuf();
  std::string stan_string = buf.str();

  // ***** could add warnings or preprocessing here *****

  typedef std::string::const_iterator input_iterator;
  typedef boost::spirit::line_pos_iterator<input_iterator> lp_iterator;

  lp_iterator fwd_begin = lp_iterator(stan_string.begin());
  lp_iterator fwd_end = lp_iterator(stan_string.end());

It'd be easy to drop code in where I wrote the comment above.

Bob

On Apr 2, 2015, at 8:56 PM, Michael Betancourt [email protected] wrote:

Turns out it’s Windows newlines — the model was originally
written on a Windows box. Is there any way for the parser
to aggregate newline types and warn the user of potential
biases in the reported line numbers?

On Apr 2, 2015, at 1:03 AM, Michael Betancourt [email protected] wrote:

I’m on a Mac — runtime errors are definitely off, too.
The model/data aren’t public so I’ll try to whip up a
pathological model tomorrow.

On Apr 2, 2015, at 12:53 AM, Bob Carpenter [email protected] wrote:

I haven't and just tried a few examples and it seems fine
for line numbers.

Can you give me the whole program and I can try it here?

Whatever's causing this issue would also cause runtime
numbers to be off, since they're using the same counting.

Are you running this Windows or Linux or Mac? They all have
different line ends, and that may also be messing things
up somehow.

Bob

On Apr 2, 2015, at 8:58 AM, Michael Betancourt [email protected] wrote:

I'm using a fresh copy of dev (both Stan and CmdStan). The new parser error line numbers look great, but they're consistently off the model I'm trying to build. For example,

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

variable "logmu_ooc_C" does not exist.

ERROR at line 124

122: beta_theta ~ normal(0,2);
123:
124: alpha_mu ~ normal(0,2);
^
125: alpha_sigma ~ normal(0,2);

while the model itself is given by

121: beta_theta ~ normal(0,2);
122:
123: alpha_mu ~ normal(0,2);
124: alpha_sigma ~ normal(0,2);
125: alpha_theta ~ normal(0,2);
126:
127: // Measurements
128: for(n in 1:N_ooc) {
129: ooc_count_C[n] ~ neg_binomial_2(exp(logmu_ooc_C), exp(log_sigma_ooc_C));
130: ooc_count_T[n] ~ neg_binomial_2(exp(log_mu_ooc_T), exp(log_sigma_ooc_T));
131: }

The parser should be identifying line 129.

Has anyone seen inconsistencies like this? If someone can confirm then I'll put together a simple reproducible example.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

betanalpha · 2015-04-02T13:01:26Z

So are the Windows newlines messing up Spirit directly?
Or is it having a mix of Unix and Windows newlines?

In the former case having a comment like “I detected Windows newlines
so be careful with the error message” would be huge given
the relatively large number of Windows users that we have.
In the latter having a “I found mixed newlines, be careful…”
would be great.

On Apr 2, 2015, at 1:35 PM, Bob Carpenter [email protected] wrote:

What do you want the behavior to be?

The line counting is happening below the level of the parser
in the iterator that Spirit uses. And it'd be hard to change
there. But the good news is we have the whole string representing
the program before it gets used to create the iterators.

Here's the iterator getting created in lang/parser.hpp:

std::ostringstream buf;
buf << input.rdbuf();
std::string stan_string = buf.str();

// ***** could add warnings or preprocessing here *****

typedef std::string::const_iterator input_iterator;
typedef boost::spirit::line_pos_iterator<input_iterator> lp_iterator;

lp_iterator fwd_begin = lp_iterator(stan_string.begin());
lp_iterator fwd_end = lp_iterator(stan_string.end());

It'd be easy to drop code in where I wrote the comment above.

Bob

On Apr 2, 2015, at 8:56 PM, Michael Betancourt [email protected] wrote:

Turns out it’s Windows newlines — the model was originally
written on a Windows box. Is there any way for the parser
to aggregate newline types and warn the user of potential
biases in the reported line numbers?

On Apr 2, 2015, at 1:03 AM, Michael Betancourt [email protected] wrote:

I’m on a Mac — runtime errors are definitely off, too.
The model/data aren’t public so I’ll try to whip up a
pathological model tomorrow.

On Apr 2, 2015, at 12:53 AM, Bob Carpenter [email protected] wrote:

I haven't and just tried a few examples and it seems fine
for line numbers.

Can you give me the whole program and I can try it here?

Whatever's causing this issue would also cause runtime
numbers to be off, since they're using the same counting.

Are you running this Windows or Linux or Mac? They all have
different line ends, and that may also be messing things
up somehow.

Bob

On Apr 2, 2015, at 8:58 AM, Michael Betancourt [email protected] wrote:

I'm using a fresh copy of dev (both Stan and CmdStan). The new parser error line numbers look great, but they're consistently off the model I'm trying to build. For example,

SYNTAX ERROR, MESSAGE(S) FROM PARSER:

variable "logmu_ooc_C" does not exist.

ERROR at line 124

122: beta_theta ~ normal(0,2);
123:
124: alpha_mu ~ normal(0,2);
^
125: alpha_sigma ~ normal(0,2);

while the model itself is given by

121: beta_theta ~ normal(0,2);
122:
123: alpha_mu ~ normal(0,2);
124: alpha_sigma ~ normal(0,2);
125: alpha_theta ~ normal(0,2);
126:
127: // Measurements
128: for(n in 1:N_ooc) {
129: ooc_count_C[n] ~ neg_binomial_2(exp(logmu_ooc_C), exp(log_sigma_ooc_C));
130: ooc_count_T[n] ~ neg_binomial_2(exp(log_mu_ooc_T), exp(log_sigma_ooc_T));
131: }

The parser should be identifying line 129.

Has anyone seen inconsistencies like this? If someone can confirm then I'll put together a simple reproducible example.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

—
Reply to this email directly or view it on GitHub.

bob-carpenter · 2015-04-02T13:31:10Z

I think the parser/iterator should treat end-of-line markers
from any OS as end-of-line markers. That ensures we get the
same behavior from the same byte stream cross platform.

I'd guess that's what's happening now, but we need to add
tests as part of resolving this issue and to figure out what's
going on.

add tests for behavior with mixed end-of-line markers

Then we can follow your second suggestion:

provide a warning if there are mixed end-of-line markers

I can take on doing both of these.

Bob

VMatthijs · 2018-12-13T13:22:17Z

This is still an issue in stanc2, from my experience.

It's been fixed in stanc3, however.

syclik modified the milestone: Future Jun 11, 2015

syclik added windows bug language labels Nov 30, 2016

bob-carpenter self-assigned this May 16, 2017

bob-carpenter modified the milestones: 2.15.0++, Future May 16, 2017

seantalts modified the milestones: 2.16.0, 2.17 Jul 5, 2017

seantalts modified the milestones: 2.17.0, 2.17.0++ Sep 6, 2017

bob-carpenter assigned mitzimorris and unassigned bob-carpenter Jun 3, 2018

bob-carpenter changed the title ~~Parser Error Line Numbers Off?~~ add tests for multi-platform end-of-line behavior Jun 3, 2018

seantalts modified the milestones: 2.18.0, 2.18.0++ Jul 13, 2018

mitzimorris removed this from the 2.18.1 milestone Dec 23, 2018

mitzimorris added this to the 2.18.1++ milestone Dec 23, 2018

alashworth mentioned this issue Mar 12, 2019

add tests for multi-platform end-of-line behavior alashworth/test-issue-import#66

Open

seantalts modified the milestones: 2.18.1++, 2.19.0++ Mar 20, 2019

serban-nicusor-toptal modified the milestones: 2.20.0++, 2.21.0 Oct 18, 2019

serban-nicusor-toptal modified the milestones: 2.22.0++, 2.23.0++ Apr 22, 2020

rok-cesnovar closed this as completed May 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add tests for multi-platform end-of-line behavior #1416

add tests for multi-platform end-of-line behavior #1416

betanalpha commented Apr 1, 2015

betanalpha commented Apr 1, 2015

bob-carpenter commented Apr 1, 2015

betanalpha commented Apr 2, 2015

betanalpha commented Apr 2, 2015

bob-carpenter commented Apr 2, 2015

betanalpha commented Apr 2, 2015

bob-carpenter commented Apr 2, 2015

VMatthijs commented Dec 13, 2018

add tests for multi-platform end-of-line behavior #1416

add tests for multi-platform end-of-line behavior #1416

Comments

betanalpha commented Apr 1, 2015

betanalpha commented Apr 1, 2015

bob-carpenter commented Apr 1, 2015

betanalpha commented Apr 2, 2015

betanalpha commented Apr 2, 2015

bob-carpenter commented Apr 2, 2015

betanalpha commented Apr 2, 2015

bob-carpenter commented Apr 2, 2015

VMatthijs commented Dec 13, 2018