Asked  1 Year ago    Answers:  5   Viewed   8 times

I would like to do some 2-dimensional walks using strings of characters by assigning different values to each character. I was planning to 'pop' the first character of a string, use it, and repeat for the rest of the string.

How can I achieve something like this?

x <- 'hello stackoverflow'

I'd like to be able to do something like this:

a <- x.pop[1]

print(a)

'h'
print(x)

'ello stackoverflow'

 Answers

1

See ?substring.

x <- 'hello stackoverflow'
substring(x, 1, 1)
## [1] "h"
substring(x, 2)
## [1] "ello stackoverflow"

The idea of having a pop method that both returns a value and has a side effect of updating the data stored in x is very much a concept from object-oriented programming. So rather than defining a pop function to operate on character vectors, we can make a reference class with a pop method.

PopStringFactory <- setRefClass(
  "PopString",
  fields = list(
    x = "character"  
  ),
  methods = list(
    initialize = function(x)
    {
      x <<- x
    },
    pop = function(n = 1)
    {
      if(nchar(x) == 0)
      {
        warning("Nothing to pop.")
        return("")
      }
      first <- substring(x, 1, n)
      x <<- substring(x, n + 1)
      first
    }
  )
)

x <- PopStringFactory$new("hello stackoverflow")
x
## Reference class object of class "PopString"
## Field "x":
## [1] "hello stackoverflow"
replicate(nchar(x$x), x$pop())
## [1] "h" "e" "l" "l" "o" " " "s" "t" "a" "c" "k" "o" "v" "e" "r" "f" "l" "o" "w"
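
If a reference class is more machinery than the task needs, a plain function that returns both pieces is a lighter alternative. The following is a minimal sketch, not part of the answer above: `pop_chars` is a hypothetical helper name, and the caller has to reassign the remainder explicitly because R functions do not modify their arguments in place.

```r
# Hypothetical helper (not in base R): split off the first n characters
# and return both the popped part and the remainder.
pop_chars <- function(s, n = 1) {
  list(first = substring(s, 1, n), rest = substring(s, n + 1))
}

x <- "hello stackoverflow"
res <- pop_chars(x)
a <- res$first   # "h"
x <- res$rest    # "ello stackoverflow"

# For a walk over every character, splitting once up front
# avoids popping repeatedly.
chars <- strsplit("hello", "")[[1]]   # "h" "e" "l" "l" "o"
```

For a character-by-character walk, the `strsplit()` form is probably the simpler choice, since the resulting vector can be indexed or looped over directly.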
Tuesday, June 1, 2021
 
Jesse
 
2

OK, based on refinements to your question, what you probably want is ltrim.

$out = ltrim($in, "0");

This will strip all leading zeroes from $in. It won't remove zeroes from anywhere else, and it won't remove anything other than zeroes. Be careful; if you give it "000" you'll get back "" instead of "0".

You could use typecasting instead, as long as $in is always a number (or you want it to result in 0 if it isn't):

$out = (int) $in;
  • 007 becomes 7
  • 000 becomes 0
  • 100 stays as 100
  • 456 stays as 456
  • 00a becomes 0
  • 56a becomes 56
  • ab4 becomes 0
  • -007 becomes -7

...etc.

Now, in the unlikely event that you only want to replace the first 0, so for example "007" becomes "07", then your latest attempt mentioned in your question is almost there. You just need to add a "caret" character to make sure it only matches the start of the string:

$out = preg_replace('/^0/', '', $in);
Thursday, April 1, 2021
3

Method 1

You can use grepl with an appropriate regular expression. Consider the following:

x <- c("blank","wade","waste","rubbish","dedekind","bated")
grepl("^.+(de|te)$",x)
[1] FALSE  TRUE  TRUE FALSE FALSE FALSE

The regular expression says: begin (^) with any character, one or more times (.+), then find either de or te ((de|te)), then end ($).

So for your data.frame try,

subset(PVs,grepl("^.+(de|te)$",Word))

Method 2

To avoid the regexp method you can use a substr method instead.

# substr the last two characters and test
substr(x,nchar(x)-1,nchar(x)) %in% c("de","te")
[1] FALSE  TRUE  TRUE FALSE FALSE FALSE

So try:

subset(PVs,substr(Word,nchar(Word)-1,nchar(Word)) %in% c("de","te"))
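
As a further option, base R (version 3.3.0 and later) provides `endsWith()`, which tests a suffix directly. The sketch below reuses the vector `x` from Method 1 and also shows one edge case where the two methods above can disagree: the regular expression requires at least one character before the suffix, while a suffix-only test does not.

```r
x <- c("blank", "wade", "waste", "rubbish", "dedekind", "bated")

# TRUE where the word ends in "de" or "te"
ends_de_te <- endsWith(x, "de") | endsWith(x, "te")
ends_de_te
## [1] FALSE  TRUE  TRUE FALSE FALSE FALSE

# Edge case: the word "de" has nothing before the suffix, so the
# regular expression (which needs .+) rejects it, while the
# suffix-only test accepts it.
grepl("^.+(de|te)$", "de")
## [1] FALSE
endsWith("de", "de")
## [1] TRUE
```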
Thursday, June 3, 2021
 
2

As others have mentioned, you cannot override the sealed S4 method "+". However, you do not need to define a new class in order to define an addition function for strings. A new class is not ideal anyway, since it forces you to convert the class of your strings, which leads to uglier code. Instead, one can simply overwrite the "+" function:

"+" = function(x,y) {
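    # Unary plus (e.g. +5) has no y, so defer to the built-in operator
    if(missing(y)) return(.Primitive("+")(x))
    if(is.character(x) || is.character(y)) {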
        return(paste(x , y, sep=""))
    } else {
        .Primitive("+")(x,y)
    }
}

Then the following should all work as expected:

1 + 4
1:10 + 4 
"Help" + "Me"

This solution feels a bit like a hack, since you are no longer using formal methods, but it's the only way to get the exact behavior you wanted.
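
If the exact `+` syntax is not essential, a custom infix operator leaves the built-in arithmetic untouched. This is only a sketch of that alternative: the name `%+%` is a hypothetical choice, and any `%name%` form would work the same way.

```r
# Hypothetical operator name; R treats any %name% function as an infix operator.
`%+%` <- function(x, y) paste0(x, y)

greeting <- "Help" %+% "Me"   # "HelpMe"
total <- 1 + 4                # base "+" still behaves as usual: 5
```

The trade-off is a slightly different spelling at each call site in exchange for not masking a core operator.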

Sunday, June 6, 2021
 
5

Yes. Strings can be seen as character arrays, and the way to access a position of an array is to use the [] operator. Usually there's no problem at all in using $str[0] (and I'm pretty sure it is much faster than the substr() method).

There is only one caveat with both methods: they will get the first byte, rather than the first character. This is important if you're using multibyte encodings (such as UTF-8). If you want to support that, use mb_substr(). Arguably, you should always assume multibyte input these days, so this is the best option, but it will be slightly slower.
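
The same byte-versus-character distinction can be illustrated in R, the language of the original question. This is a small sketch under the assumption of a UTF-8 string, which is what the `\u` escapes below produce; it shows that `substring()` in R counts characters, not bytes.

```r
# "été", written with escapes so the source file's encoding does not matter
x <- "\u00e9t\u00e9"

n_chars <- nchar(x, type = "chars")   # 3 characters
n_bytes <- nchar(x, type = "bytes")   # 5 bytes in UTF-8 (2 + 1 + 2)

# substring() works on characters, so the first one comes back whole
first_char <- substring(x, 1, 1)
```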

Saturday, June 12, 2021
 